[jira] [Commented] (HIVE-11037) HiveOnTez: make explain user level = true as default
[ https://issues.apache.org/jira/browse/HIVE-11037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594437#comment-14594437 ] Hive QA commented on HIVE-11037: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12740793/HIVE-11037.05.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9012 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4330/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4330/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4330/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12740793 - PreCommit-HIVE-TRUNK-Build > HiveOnTez: make explain user level = true as default > > > Key: HIVE-11037 > URL: https://issues.apache.org/jira/browse/HIVE-11037 > Project: Hive > Issue Type: Improvement >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-11037.01.patch, HIVE-11037.02.patch, > HIVE-11037.03.patch, HIVE-11037.04.patch, HIVE-11037.05.patch > > > In Hive-9780, we introduced a new level of explain for hive on tez. We would > like to make it running by default. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
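For context, the user-level explain introduced in HIVE-9780 is governed by a HiveConf property. A minimal sketch of what enabling it would look like in hive-site.xml, assuming the property name {{hive.explain.user}} (the patch itself presumably flips the default inside HiveConf rather than requiring a site setting):

```xml
<!-- Hypothetical hive-site.xml fragment; the actual patch changes the default in HiveConf. -->
<property>
  <name>hive.explain.user</name>
  <value>true</value>
  <description>Show EXPLAIN output at user level for Hive on Tez.</description>
</property>
```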
[jira] [Commented] (HIVE-11044) Some optimizable predicates being missed by constant propagation
[ https://issues.apache.org/jira/browse/HIVE-11044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594397#comment-14594397 ] Hive QA commented on HIVE-11044: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12740749/HIVE-11044.2.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9011 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join28 {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4329/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4329/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4329/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12740749 - PreCommit-HIVE-TRUNK-Build > Some optimizable predicates being missed by constant propagation > > > Key: HIVE-11044 > URL: https://issues.apache.org/jira/browse/HIVE-11044 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Jason Dere >Assignee: Jason Dere > Attachments: HIVE-11044.1.patch, HIVE-11044.2.patch > > > Some of the qfile explain plans show some predicates that could be taken care > of by running ConstantPropagate after the PartitionPruner: > index_auto_unused.q: > {noformat} > filterExpr: ((12.0 = 12.0) and (UDFToDouble(key) < 10.0)) (type: boolean) > {noformat} > join28.q: > {noformat} > predicate: ((11.0 = 11.0) and key is not null) (type: boolean) > {noformat} > bucketsort_optimize_insert_7.q ("is not null" is unnecessary) > {noformat} > predicate: (((key < 8) and key is not null) and ((key = 0) or (key = 5))) > (type: boolean) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
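The kind of simplification the description asks for can be sketched in a few lines: a conjunct comparing two literals, such as {{(11.0 = 11.0)}}, folds to true and can be dropped from the conjunction. This is an illustrative toy, not Hive's ConstantPropagate optimizer; the class and method names are hypothetical:

```java
// Illustrative only: simplify a conjunction whose left conjunct compares
// two literal doubles, e.g. "(11.0 = 11.0) and key is not null".
public class ConstantFoldSketch {
    public static String simplify(double lhs, double rhs, String otherConjunct) {
        if (lhs == rhs) {
            return otherConjunct;  // true AND p  ->  p
        }
        return "false";            // false AND p ->  false (prune entirely)
    }

    public static void main(String[] args) {
        System.out.println(simplify(11.0, 11.0, "key is not null"));
        System.out.println(simplify(11.0, 12.0, "key is not null"));
    }
}
```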
[jira] [Commented] (HIVE-7193) Hive should support additional LDAP authentication parameters
[ https://issues.apache.org/jira/browse/HIVE-7193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594389#comment-14594389 ] Lefty Leverenz commented on HIVE-7193: -- The parameter descriptions in patch 5 look good. Just one nit unfixed: "return" should be "returns" in the description of hive.server2.authentication.ldap.customLDAPQuery. Thanks. > Hive should support additional LDAP authentication parameters > - > > Key: HIVE-7193 > URL: https://issues.apache.org/jira/browse/HIVE-7193 > Project: Hive > Issue Type: Bug >Affects Versions: 0.10.0 >Reporter: Mala Chikka Kempanna >Assignee: Naveen Gangam > Attachments: HIVE-7193.2.patch, HIVE-7193.3.patch, HIVE-7193.4.patch, > HIVE-7193.5.patch, HIVE-7193.patch, LDAPAuthentication_Design_Doc.docx, > LDAPAuthentication_Design_Doc_V2.docx > > > Currently hive has only following authenticator parameters for LDAP > authentication for hiveserver2: > {code:xml} > > hive.server2.authentication > LDAP > > > hive.server2.authentication.ldap.url > ldap://our_ldap_address > > {code} > We need to include other LDAP properties as part of hive-LDAP authentication > like below: > {noformat} > a group search base -> dc=domain,dc=com > a group search filter -> member={0} > a user search base -> dc=domain,dc=com > a user search filter -> sAMAAccountName={0} > a list of valid user groups -> group1,group2,group3 > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11037) HiveOnTez: make explain user level = true as default
[ https://issues.apache.org/jira/browse/HIVE-11037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-11037: --- Attachment: HIVE-11037.05.patch Addresses the newly added tez_self_join.q test case. > HiveOnTez: make explain user level = true as default > > > Key: HIVE-11037 > URL: https://issues.apache.org/jira/browse/HIVE-11037 > Project: Hive > Issue Type: Improvement >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-11037.01.patch, HIVE-11037.02.patch, > HIVE-11037.03.patch, HIVE-11037.04.patch, HIVE-11037.05.patch > > > In Hive-9780, we introduced a new level of explain for hive on tez. We would > like to make it running by default. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11059) hcatalog-server-extensions tests scope should depend on hive-exec
[ https://issues.apache.org/jira/browse/HIVE-11059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594342#comment-14594342 ] Hive QA commented on HIVE-11059: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12740748/HIVE-11059.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9004 tests executed *Failed tests:* {noformat} TestSSL - did not produce a TEST-*.xml file org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchAbort {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4328/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4328/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4328/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12740748 - PreCommit-HIVE-TRUNK-Build > hcatalog-server-extensions tests scope should depend on hive-exec > - > > Key: HIVE-11059 > URL: https://issues.apache.org/jira/browse/HIVE-11059 > Project: Hive > Issue Type: Bug > Components: Tests >Affects Versions: 1.2.1 >Reporter: Sushanth Sowmyan >Assignee: Sushanth Sowmyan >Priority: Minor > Attachments: HIVE-11059.patch > > > (causes test failures in Windows due to the lack of WindowsPathUtil being > available otherwise) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
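A test-scoped dependency of the kind the summary describes would look roughly like the following pom.xml fragment. This is a sketch assuming the standard Maven coordinates for hive-exec; the actual patch may differ:

```xml
<!-- Hypothetical fragment for hcatalog-server-extensions/pom.xml -->
<dependency>
  <groupId>org.apache.hive</groupId>
  <artifactId>hive-exec</artifactId>
  <version>${project.version}</version>
  <scope>test</scope>
</dependency>
```

With test scope, hive-exec classes such as WindowsPathUtil are on the classpath during test compilation and execution but are not leaked into the module's compile-time API.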
[jira] [Commented] (HIVE-11057) HBase metastore chokes on partition with ':' in name
[ https://issues.apache.org/jira/browse/HIVE-11057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594268#comment-14594268 ] Hive QA commented on HIVE-11057: {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12740711/HIVE-11057.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4327/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4327/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4327/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]] + export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + export PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-4327/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + [[ 
-z master ]] + [[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! -d apache-github-source-source ]] + cd apache-github-source-source + git fetch origin >From https://github.com/apache/hive 74fe6f7..379cd85 branch-1 -> origin/branch-1 + git reset --hard HEAD HEAD is now at b97303c HIVE-11050: testCliDriver_vector_outer_join.* failures in Unit tests due to unstable data creation queries (Matt McCline reviewed by Gunther Hagleitner) + git clean -f -d + git checkout master Already on 'master' + git reset --hard origin/master HEAD is now at b97303c HIVE-11050: testCliDriver_vector_outer_join.* failures in Unit tests due to unstable data creation queries (Matt McCline reviewed by Gunther Hagleitner) + git merge --ff-only origin/master Already up-to-date. + git gc + patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hive-ptest/working/scratch/build.patch + [[ -f /data/hive-ptest/working/scratch/build.patch ]] + chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh + /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch The patch does not appear to apply with p0, p1, or p2 + exit 1 ' {noformat} This message is automatically generated. ATTACHMENT ID: 12740711 - PreCommit-HIVE-TRUNK-Build > HBase metastore chokes on partition with ':' in name > > > Key: HIVE-11057 > URL: https://issues.apache.org/jira/browse/HIVE-11057 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: hbase-metastore-branch >Reporter: Alan Gates >Assignee: Alan Gates > Fix For: hbase-metastore-branch > > Attachments: HIVE-11057.patch > > > The HBase metastore uses ':' as a key separator when building keys for the > partition table. This means that partitions with a colon in the name (which > is legal) cause problems. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
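The ambiguity behind HIVE-11057 is easy to demonstrate: joining partition values with a bare ':' produces colliding keys when a value itself contains ':'. The sketch below is illustrative, not the HBase metastore's actual key-encoding code; escaping the separator is one possible fix, and the committed patch may take a different approach:

```java
public class PartitionKeySketch {
    // Naive key building: join partition values with ':'.
    // Ambiguous when a value itself contains ':'.
    public static String naiveKey(String... values) {
        return String.join(":", values);
    }

    // One possible fix (illustrative): escape backslashes and the
    // separator inside each value before joining.
    public static String escapedKey(String... values) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < values.length; i++) {
            if (i > 0) sb.append(':');
            sb.append(values[i].replace("\\", "\\\\").replace(":", "\\:"));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Two different partitions collide under the naive scheme...
        System.out.println(naiveKey("a:b", "c").equals(naiveKey("a", "b:c"))); // true
        // ...but stay distinct once the separator is escaped.
        System.out.println(escapedKey("a:b", "c").equals(escapedKey("a", "b:c"))); // false
    }
}
```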
[jira] [Commented] (HIVE-11042) Need to fix Utilities.replaceTaskId method
[ https://issues.apache.org/jira/browse/HIVE-11042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594266#comment-14594266 ] Hive QA commented on HIVE-11042: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12740740/HIVE-11042.3.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9012 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4326/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4326/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4326/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12740740 - PreCommit-HIVE-TRUNK-Build > Need to fix Utilities.replaceTaskId method > --- > > Key: HIVE-11042 > URL: https://issues.apache.org/jira/browse/HIVE-11042 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 1.2.0 >Reporter: Yongzhi Chen >Assignee: Yongzhi Chen > Attachments: HIVE-11042.1.patch, HIVE-11042.2.patch, > HIVE-11042.3.patch > > > While looking at another bug, I found that the Utilities.replaceTaskId(String, > int) method is not right. > For example, > Utilities.replaceTaskId("(ds%3D1)01", 5) > returns "5". > It should return "(ds%3D1)05". -- This message was sent by Atlassian JIRA (v6.3.4#6332)
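The intended behavior can be sketched as follows. This is an illustrative re-implementation based only on the example in the description, not Hive's actual Utilities code: replace the trailing task-id digits while preserving any prefix such as "(ds%3D1)", padding the new id to the original width.

```java
// Hypothetical sketch (not Hive's actual code).
public class ReplaceTaskIdSketch {
    public static String replaceTaskId(String taskId, int newId) {
        // Find the trailing run of digits, e.g. "01" in "(ds%3D1)01".
        int i = taskId.length();
        while (i > 0 && Character.isDigit(taskId.charAt(i - 1))) {
            i--;
        }
        int width = taskId.length() - i;
        if (width == 0) {
            return taskId;  // no trailing task id to replace
        }
        // Keep the prefix; pad the new id to the width of the old one.
        return taskId.substring(0, i) + String.format("%0" + width + "d", newId);
    }

    public static void main(String[] args) {
        System.out.println(replaceTaskId("(ds%3D1)01", 5)); // (ds%3D1)05
    }
}
```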
[jira] [Updated] (HIVE-10233) Hive on tez: memory manager for grace hash join
[ https://issues.apache.org/jira/browse/HIVE-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-10233: -- Attachment: HIVE-10233.10.patch > Hive on tez: memory manager for grace hash join > --- > > Key: HIVE-10233 > URL: https://issues.apache.org/jira/browse/HIVE-10233 > Project: Hive > Issue Type: Bug > Components: Tez >Affects Versions: llap, 2.0.0 >Reporter: Vikram Dixit K >Assignee: Vikram Dixit K > Attachments: HIVE-10233-WIP-2.patch, HIVE-10233-WIP-3.patch, > HIVE-10233-WIP-4.patch, HIVE-10233-WIP-5.patch, HIVE-10233-WIP-6.patch, > HIVE-10233-WIP-7.patch, HIVE-10233-WIP-8.patch, HIVE-10233.08.patch, > HIVE-10233.09.patch, HIVE-10233.10.patch > > > We need a memory manager in llap/tez to manage the usage of memory across > threads. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-11050) testCliDriver_vector_outer_join.* failures in Unit tests due to unstable data creation queries
[ https://issues.apache.org/jira/browse/HIVE-11050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran resolved HIVE-11050. -- Resolution: Fixed Committed to branch-1 as well. Thanks [~mmccline]! > testCliDriver_vector_outer_join.* failures in Unit tests due to unstable data > creation queries > -- > > Key: HIVE-11050 > URL: https://issues.apache.org/jira/browse/HIVE-11050 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 1.2.1 >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Blocker > Fix For: 1.2.1 > > Attachments: HIVE-11050.01.branch-1.patch, HIVE-11050.01.patch > > > In some environments the Q file tests vector_outer_join\{1-4\}.q fail because > the data creation queries produce different input files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11050) testCliDriver_vector_outer_join.* failures in Unit tests due to unstable data creation queries
[ https://issues.apache.org/jira/browse/HIVE-11050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594213#comment-14594213 ] Matt McCline commented on HIVE-11050: - I could not apply the patch to branch-1 either. I recreated the changes using a difftool and attached that patch as HIVE-11050.01.branch-1.patch. I tried to commit, but I don't have permissions for that branch. [~prasanthj] Can you try committing the alternate patch? Thanks. > testCliDriver_vector_outer_join.* failures in Unit tests due to unstable data > creation queries > -- > > Key: HIVE-11050 > URL: https://issues.apache.org/jira/browse/HIVE-11050 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 1.2.1 >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Blocker > Fix For: 1.2.1 > > Attachments: HIVE-11050.01.branch-1.patch, HIVE-11050.01.patch > > > In some environments the Q file tests vector_outer_join\{1-4\}.q fail because > the data creation queries produce different input files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11050) testCliDriver_vector_outer_join.* failures in Unit tests due to unstable data creation queries
[ https://issues.apache.org/jira/browse/HIVE-11050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-11050: Attachment: HIVE-11050.01.branch-1.patch > testCliDriver_vector_outer_join.* failures in Unit tests due to unstable data > creation queries > -- > > Key: HIVE-11050 > URL: https://issues.apache.org/jira/browse/HIVE-11050 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 1.2.1 >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Blocker > Fix For: 1.2.1 > > Attachments: HIVE-11050.01.branch-1.patch, HIVE-11050.01.patch > > > In some environments the Q file tests vector_outer_join\{1-4\}.q fail because > the data creation queries produce different input files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11037) HiveOnTez: make explain user level = true as default
[ https://issues.apache.org/jira/browse/HIVE-11037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594150#comment-14594150 ] Hive QA commented on HIVE-11037: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12740685/HIVE-11037.03.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9012 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_self_join {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4325/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4325/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4325/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12740685 - PreCommit-HIVE-TRUNK-Build > HiveOnTez: make explain user level = true as default > > > Key: HIVE-11037 > URL: https://issues.apache.org/jira/browse/HIVE-11037 > Project: Hive > Issue Type: Improvement >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-11037.01.patch, HIVE-11037.02.patch, > HIVE-11037.03.patch, HIVE-11037.04.patch > > > In Hive-9780, we introduced a new level of explain for hive on tez. We would > like to make it running by default. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11060) Make test windowing.q robust
[ https://issues.apache.org/jira/browse/HIVE-11060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594148#comment-14594148 ] Jesus Camacho Rodriguez commented on HIVE-11060: Thanks Ashutosh! I just did it. > Make test windowing.q robust > > > Key: HIVE-11060 > URL: https://issues.apache.org/jira/browse/HIVE-11060 > Project: Hive > Issue Type: Bug > Components: Tests >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-11060.01.patch, HIVE-11060.patch > > > Add partition / order by in over clause to make result set deterministic. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11060) Make test windowing.q robust
[ https://issues.apache.org/jira/browse/HIVE-11060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-11060: --- Attachment: HIVE-11060.01.patch > Make test windowing.q robust > > > Key: HIVE-11060 > URL: https://issues.apache.org/jira/browse/HIVE-11060 > Project: Hive > Issue Type: Bug > Components: Tests >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-11060.01.patch, HIVE-11060.patch > > > Add partition / order by in over clause to make result set deterministic. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11050) testCliDriver_vector_outer_join.* failures in Unit tests due to unstable data creation queries
[ https://issues.apache.org/jira/browse/HIVE-11050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-11050: - Fix Version/s: (was: 2.0.0) > testCliDriver_vector_outer_join.* failures in Unit tests due to unstable data > creation queries > -- > > Key: HIVE-11050 > URL: https://issues.apache.org/jira/browse/HIVE-11050 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 1.2.1 >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Blocker > Fix For: 1.2.1 > > Attachments: HIVE-11050.01.patch > > > In some environments the Q file tests vector_outer_join\{1-4\}.q fail because > the data creation queries produce different input files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (HIVE-11050) testCliDriver_vector_outer_join.* failures in Unit tests due to unstable data creation queries
[ https://issues.apache.org/jira/browse/HIVE-11050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran reopened HIVE-11050: -- [~mmccline] Patch does not apply cleanly in branch-1. Reopening the issue. > testCliDriver_vector_outer_join.* failures in Unit tests due to unstable data > creation queries > -- > > Key: HIVE-11050 > URL: https://issues.apache.org/jira/browse/HIVE-11050 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 1.2.1 >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Blocker > Fix For: 1.2.1, 2.0.0 > > Attachments: HIVE-11050.01.patch > > > In some environments the Q file tests vector_outer_join\{1-4\}.q fail because > the data creation queries produce different input files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11033) BloomFilter index is not honored by ORC reader
[ https://issues.apache.org/jira/browse/HIVE-11033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-11033: - Fix Version/s: (was: 2.0.0) > BloomFilter index is not honored by ORC reader > -- > > Key: HIVE-11033 > URL: https://issues.apache.org/jira/browse/HIVE-11033 > Project: Hive > Issue Type: Bug >Affects Versions: 1.2.0 >Reporter: Allan Yan >Assignee: Prasanth Jayachandran > Fix For: 1.2.1 > > Attachments: HIVE-11033.2.patch, HIVE-11033.patch > > > There is a bug in the org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl class > which causes the bloom filter index saved in the ORC file not to be used. The > root cause is that the bloomFilterIndices variable defined in the SargApplier > class supersedes the one defined in its parent class. Therefore, in > ReaderImpl.pickRowGroups() > {code} > protected boolean[] pickRowGroups() throws IOException { > // if we don't have a sarg or indexes, we read everything > if (sargApp == null) { > return null; > } > readRowIndex(currentStripe, included, sargApp.sargColumns); > return sargApp.pickRowGroups(stripes.get(currentStripe), indexes); > } > {code} > the bloomFilterIndices populated by readRowIndex() is not picked up by > the sargApp object. One solution is to make SargApplier.bloomFilterIndices a > reference to its parent counterpart. 
> {noformat} > 18:46 $ diff src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java > src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java.original > 174d173 > < bloomFilterIndices = new OrcProto.BloomFilterIndex[types.size()]; > 178c177 > < sarg, options.getColumnNames(), strideRate, types, > included.length, bloomFilterIndices); > --- > > sarg, options.getColumnNames(), strideRate, types, > > included.length); > 204a204 > > bloomFilterIndices = new OrcProto.BloomFilterIndex[types.size()]; > 673c673 > < List types, int includedCount, > OrcProto.BloomFilterIndex[] bloomFilterIndices) { > --- > > List types, int includedCount) { > 677c677 > < this.bloomFilterIndices = bloomFilterIndices; > --- > > bloomFilterIndices = new OrcProto.BloomFilterIndex[types.size()]; > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
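The bug pattern described above (a helper object keeping its own copy of an array rather than a reference to its owner's) can be reduced to a minimal illustration. The classes below are hypothetical stand-ins, not Hive's RecordReaderImpl/SargApplier:

```java
// Minimal illustration of the shadowing bug pattern: the broken helper
// allocates its own array, so updates written through the owner's field
// are never observed; the fixed helper shares the owner's reference.
public class ShadowingSketch {
    public static class Owner {
        public String[] indices = new String[1];                 // populated by readRowIndex()
        public BrokenHelper broken = new BrokenHelper();          // separate array: update lost
        public SharedHelper shared = new SharedHelper(indices);   // shares the owner's array

        public void readRowIndex() {
            indices[0] = "bloom-filter-data";
        }
    }

    public static class BrokenHelper {
        public String[] indices = new String[1];  // shadow copy
    }

    public static class SharedHelper {
        public final String[] indices;            // fix: reference, not copy
        public SharedHelper(String[] indices) { this.indices = indices; }
    }

    public static void main(String[] args) {
        Owner o = new Owner();
        o.readRowIndex();
        System.out.println(o.broken.indices[0]);  // null
        System.out.println(o.shared.indices[0]);  // bloom-filter-data
    }
}
```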
[jira] [Updated] (HIVE-11031) ORC concatenation of old files can fail while merging column statistics
[ https://issues.apache.org/jira/browse/HIVE-11031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-11031: - Fix Version/s: (was: 2.0.0) (was: 1.1.1) (was: 1.0.1) > ORC concatenation of old files can fail while merging column statistics > --- > > Key: HIVE-11031 > URL: https://issues.apache.org/jira/browse/HIVE-11031 > Project: Hive > Issue Type: Bug >Affects Versions: 0.13.0, 0.14.0, 1.0.0, 1.2.0, 1.1.0, 2.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran >Priority: Critical > Fix For: 1.2.1 > > Attachments: HIVE-11031-branch-1.0.patch, HIVE-11031.2.patch, > HIVE-11031.3.patch, HIVE-11031.4.patch, HIVE-11031.patch > > > Column statistics in ORC are optional protobuf fields. Old ORC files might > not have statistics for newly added types like decimal, date, timestamp etc. > But column statistics merging assumes column statistics exists for these > types and invokes merge. For example, merging of TimestampColumnStatistics > directly casts the received ColumnStatistics object without doing instanceof > check. If the ORC file contains time stamp column statistics then this will > work else it will throw ClassCastException. > Also, the file merge operator swallows the exception. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
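The defensive merge described above amounts to an instanceof check before the cast. The sketch below is illustrative; the interface and class names mimic but are not ORC's actual statistics classes:

```java
public class StatsMergeSketch {
    public interface ColumnStatistics {}

    public static class TimestampColumnStatistics implements ColumnStatistics {
        public long max = Long.MIN_VALUE;

        public void merge(ColumnStatistics other) {
            // Old ORC files may carry a plain statistics object with no
            // timestamp data for this column, so check before casting.
            if (other instanceof TimestampColumnStatistics) {
                max = Math.max(max, ((TimestampColumnStatistics) other).max);
            }
            // else: nothing to merge from an older file
        }
    }

    public static class EmptyColumnStatistics implements ColumnStatistics {}

    public static void main(String[] args) {
        TimestampColumnStatistics merged = new TimestampColumnStatistics();
        TimestampColumnStatistics ts = new TimestampColumnStatistics();
        ts.max = 42L;
        merged.merge(ts);                           // normal merge
        merged.merge(new EmptyColumnStatistics());  // an unchecked cast here would throw ClassCastException
        System.out.println(merged.max);             // 42
    }
}
```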
[jira] [Updated] (HIVE-11035) PPD: Orc Split elimination fails because filterColumns=[-1]
[ https://issues.apache.org/jira/browse/HIVE-11035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-11035: - Fix Version/s: (was: 2.0.0) (was: 1.1.1) (was: 1.0.1) > PPD: Orc Split elimination fails because filterColumns=[-1] > --- > > Key: HIVE-11035 > URL: https://issues.apache.org/jira/browse/HIVE-11035 > Project: Hive > Issue Type: Bug >Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 1.3.0, 2.0.0 >Reporter: Gopal V >Assignee: Prasanth Jayachandran > Fix For: 1.2.1 > > Attachments: HIVE-11035-branch-1.0.patch, HIVE-11035.patch > > > {code} > create temporary table xx (x int) stored as orc ; > insert into xx values (20),(200); > set hive.fetch.task.conversion=none; > select * from xx where x is null; > {code} > This should generate zero tasks after optional split elimination in the app > master, instead of generating the 1 task which for sure hits the row-index > filters and removes all rows anyway. > Right now, this runs 1 task for the stripe containing (min=20, max=200, > has_null=false), which is broken. > Instead, it returns YES_NO_NULL from the following default case > https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java#L976 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
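The expected split-elimination logic for the report's example can be sketched as a truth-value evaluation: with stripe statistics of (min=20, max=200, has_null=false), an "x is null" predicate can never match, so the stripe should be eliminated rather than evaluated conservatively. This is an illustrative reduction, not OrcInputFormat's actual code:

```java
public class SplitEliminationSketch {
    public enum TruthValue { YES, NO, YES_NO_NULL }

    // Evaluate an "x IS NULL" predicate against stripe-level statistics.
    // If the stripe records hasNull=false, no row can satisfy IS NULL,
    // so the stripe is skippable (NO) instead of the conservative
    // YES_NO_NULL that forces a task to be launched.
    public static TruthValue evalIsNull(boolean hasNull) {
        return hasNull ? TruthValue.YES_NO_NULL : TruthValue.NO;
    }

    public static void main(String[] args) {
        // Stripe with (min=20, max=200, has_null=false) from the report:
        System.out.println(evalIsNull(false)); // NO -> stripe eliminated
    }
}
```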
[jira] [Updated] (HIVE-10685) Alter table concatenate operator will cause duplicate data
[ https://issues.apache.org/jira/browse/HIVE-10685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-10685: - Fix Version/s: (was: 2.0.0) (was: 1.1.1) (was: 1.0.1) > Alter table concatenate operator will cause duplicate data > -- > > Key: HIVE-10685 > URL: https://issues.apache.org/jira/browse/HIVE-10685 > Project: Hive > Issue Type: Bug >Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 1.3.0, 1.2.1 >Reporter: guoliming >Assignee: guoliming >Priority: Critical > Fix For: 1.2.1 > > Attachments: HIVE-10685.patch > > > The "Orders" table has 15 rows and is stored as ORC. > {noformat} > hive> select count(*) from orders; > OK > 15 > Time taken: 37.692 seconds, Fetched: 1 row(s) > {noformat} > The table contains 14 files; the size of each file is about 2.1 ~ 3.2 GB. > After executing the command ALTER TABLE orders CONCATENATE, > the table has 1530115000 rows. > My Hive version is 1.1.0. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11027) Hive on tez: Bucket map joins fail when hashcode goes negative
[ https://issues.apache.org/jira/browse/HIVE-11027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-11027: - Fix Version/s: (was: 2.0.0) (was: 1.1.1) (was: 1.0.1) > Hive on tez: Bucket map joins fail when hashcode goes negative > -- > > Key: HIVE-11027 > URL: https://issues.apache.org/jira/browse/HIVE-11027 > Project: Hive > Issue Type: Bug > Components: Tez >Affects Versions: 0.14.0, 1.0.0, 0.13 >Reporter: Vikram Dixit K >Assignee: Prasanth Jayachandran > Fix For: 1.2.1 > > Attachments: HIVE-11027.patch > > > Seeing an issue when dynamic sort optimization is enabled while doing an > insert into bucketed table. We seem to be flipping the negative sign on the > hashcode instead of taking the complement of it for routing the data > correctly. This results in correctness issues in bucket map joins in hive on > tez when the hash code goes negative. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
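The sign-flip problem described above is a classic pitfall: negating a negative hashcode still yields a negative value for Integer.MIN_VALUE, so routing by modulo can go negative. Masking off the sign bit is one common remedy; the code below is an illustrative sketch, not Hive's actual routing code, and the committed fix (described as "taking the complement") may differ in detail:

```java
public class BucketRoutingSketch {
    // Buggy variant (illustrative): flipping the sign overflows for
    // Integer.MIN_VALUE (-x == x there), so the modulo can be negative.
    public static int buggyBucket(int hashCode, int numBuckets) {
        int h = hashCode < 0 ? -hashCode : hashCode;
        return h % numBuckets;
    }

    // Safer variant: mask off the sign bit so the routing value is
    // always non-negative before taking the modulo.
    public static int maskedBucket(int hashCode, int numBuckets) {
        return (hashCode & Integer.MAX_VALUE) % numBuckets;
    }

    public static void main(String[] args) {
        System.out.println(buggyBucket(Integer.MIN_VALUE, 31));  // negative: rows mis-routed
        System.out.println(maskedBucket(Integer.MIN_VALUE, 31)); // in [0, 31)
    }
}
```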
[jira] [Commented] (HIVE-11060) Make test windowing.q robust
[ https://issues.apache.org/jira/browse/HIVE-11060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594085#comment-14594085 ] Ashutosh Chauhan commented on HIVE-11060: - You also need to update golden file for SparkCliDriver for this test. Otherwise looks good, +1 > Make test windowing.q robust > > > Key: HIVE-11060 > URL: https://issues.apache.org/jira/browse/HIVE-11060 > Project: Hive > Issue Type: Bug > Components: Tests >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-11060.patch > > > Add partition / order by in over clause to make result set deterministic. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11060) Make test windowing.q robust
[ https://issues.apache.org/jira/browse/HIVE-11060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-11060: --- Attachment: HIVE-11060.patch > Make test windowing.q robust > > > Key: HIVE-11060 > URL: https://issues.apache.org/jira/browse/HIVE-11060 > Project: Hive > Issue Type: Bug > Components: Tests >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-11060.patch > > > Add partition / order by in over clause to make result set deterministic. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11059) hcatalog-server-extensions tests scope should depend on hive-exec
[ https://issues.apache.org/jira/browse/HIVE-11059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594079#comment-14594079 ] Hari Sankar Sivarama Subramaniyan commented on HIVE-11059: -- +1 > hcatalog-server-extensions tests scope should depend on hive-exec > - > > Key: HIVE-11059 > URL: https://issues.apache.org/jira/browse/HIVE-11059 > Project: Hive > Issue Type: Bug > Components: Tests >Affects Versions: 1.2.1 >Reporter: Sushanth Sowmyan >Assignee: Sushanth Sowmyan >Priority: Minor > Attachments: HIVE-11059.patch > > > (causes test failures in Windows due to the lack of WindowsPathUtil being > available otherwise) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11044) Some optimizable predicates being missed by constant propagation
[ https://issues.apache.org/jira/browse/HIVE-11044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594078#comment-14594078 ] Ashutosh Chauhan commented on HIVE-11044: - +1 > Some optimizable predicates being missed by constant propagation > > > Key: HIVE-11044 > URL: https://issues.apache.org/jira/browse/HIVE-11044 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Jason Dere >Assignee: Jason Dere > Attachments: HIVE-11044.1.patch, HIVE-11044.2.patch > > > Some of the qfile explain plans show some predicates that could be taken care > of by running ConstantPropagate after the PartitionPruner: > index_auto_unused.q: > {noformat} > filterExpr: ((12.0 = 12.0) and (UDFToDouble(key) < 10.0)) (type: boolean) > {noformat} > join28.q: > {noformat} > predicate: ((11.0 = 11.0) and key is not null) (type: boolean) > {noformat} > bucketsort_optimize_insert_7.q ("is not null" is unnecessary) > {noformat} > predicate: (((key < 8) and key is not null) and ((key = 0) or (key = 5))) > (type: boolean) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10996) Aggregation / Projection over Multi-Join Inner Query producing incorrect results
[ https://issues.apache.org/jira/browse/HIVE-10996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-10996: --- Attachment: HIVE-10996.07.patch > Aggregation / Projection over Multi-Join Inner Query producing incorrect > results > > > Key: HIVE-10996 > URL: https://issues.apache.org/jira/browse/HIVE-10996 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 1.0.0, 1.2.0, 1.1.0, 1.3.0, 2.0.0 >Reporter: Gautam Kowshik >Assignee: Jesus Camacho Rodriguez >Priority: Critical > Attachments: HIVE-10996.01.patch, HIVE-10996.02.patch, > HIVE-10996.03.patch, HIVE-10996.04.patch, HIVE-10996.05.patch, > HIVE-10996.06.patch, HIVE-10996.07.patch, HIVE-10996.patch, explain_q1.txt, > explain_q2.txt > > > We see the following problem on 1.1.0 and 1.2.0 but not 0.13 which seems like > a regression. > The following query (Q1) produces no results: > {code} > select s > from ( > select last.*, action.st2, action.n > from ( > select purchase.s, purchase.timestamp, max (mevt.timestamp) as > last_stage_timestamp > from (select * from purchase_history) purchase > join (select * from cart_history) mevt > on purchase.s = mevt.s > where purchase.timestamp > mevt.timestamp > group by purchase.s, purchase.timestamp > ) last > join (select * from events) action > on last.s = action.s and last.last_stage_timestamp = action.timestamp > ) list; > {code} > While this one (Q2) does produce results : > {code} > select * > from ( > select last.*, action.st2, action.n > from ( > select purchase.s, purchase.timestamp, max (mevt.timestamp) as > last_stage_timestamp > from (select * from purchase_history) purchase > join (select * from cart_history) mevt > on purchase.s = mevt.s > where purchase.timestamp > mevt.timestamp > group by purchase.s, purchase.timestamp > ) last > join (select * from events) action > on last.s = action.s and last.last_stage_timestamp = action.timestamp > ) list; > 1 21 20 Bob 1234 > 1 31 30 Bob 1234 > 3 51 
50 Jeff1234 > {code} > The setup to test this is: > {code} > create table purchase_history (s string, product string, price double, > timestamp int); > insert into purchase_history values ('1', 'Belt', 20.00, 21); > insert into purchase_history values ('1', 'Socks', 3.50, 31); > insert into purchase_history values ('3', 'Belt', 20.00, 51); > insert into purchase_history values ('4', 'Shirt', 15.50, 59); > create table cart_history (s string, cart_id int, timestamp int); > insert into cart_history values ('1', 1, 10); > insert into cart_history values ('1', 2, 20); > insert into cart_history values ('1', 3, 30); > insert into cart_history values ('1', 4, 40); > insert into cart_history values ('3', 5, 50); > insert into cart_history values ('4', 6, 60); > create table events (s string, st2 string, n int, timestamp int); > insert into events values ('1', 'Bob', 1234, 20); > insert into events values ('1', 'Bob', 1234, 30); > insert into events values ('1', 'Bob', 1234, 25); > insert into events values ('2', 'Sam', 1234, 30); > insert into events values ('3', 'Jeff', 1234, 50); > insert into events values ('4', 'Ted', 1234, 60); > {code} > I realize select * and select s are not all that interesting in this context > but what lead us to this issue was select count(distinct s) was not returning > results. The above queries are the simplified queries that produce the issue. > I will note that if I convert the inner join to a table and select from that > the issue does not appear. > Update: Found that turning off hive.optimize.remove.identity.project fixes > this issue. This optimization was introduced in > https://issues.apache.org/jira/browse/HIVE-8435 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11037) HiveOnTez: make explain user level = true as default
[ https://issues.apache.org/jira/browse/HIVE-11037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-11037: --- Attachment: HIVE-11037.04.patch address [~jpullokkaran]'s comments. > HiveOnTez: make explain user level = true as default > > > Key: HIVE-11037 > URL: https://issues.apache.org/jira/browse/HIVE-11037 > Project: Hive > Issue Type: Improvement >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-11037.01.patch, HIVE-11037.02.patch, > HIVE-11037.03.patch, HIVE-11037.04.patch > > > In Hive-9780, we introduced a new level of explain for hive on tez. We would > like to make it running by default. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11044) Some optimizable predicates being missed by constant propagation
[ https://issues.apache.org/jira/browse/HIVE-11044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-11044: -- Attachment: HIVE-11044.2.patch patch v2 - updating golden for explainuser_2.q, now that the patch for HIVE-11028 has been committed > Some optimizable predicates being missed by constant propagation > > > Key: HIVE-11044 > URL: https://issues.apache.org/jira/browse/HIVE-11044 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Jason Dere >Assignee: Jason Dere > Attachments: HIVE-11044.1.patch, HIVE-11044.2.patch > > > Some of the qfile explain plans show some predicates that could be taken care > of by running ConstantPropagate after the PartitionPruner: > index_auto_unused.q: > {noformat} > filterExpr: ((12.0 = 12.0) and (UDFToDouble(key) < 10.0)) (type: boolean) > {noformat} > join28.q: > {noformat} > predicate: ((11.0 = 11.0) and key is not null) (type: boolean) > {noformat} > bucketsort_optimize_insert_7.q ("is not null" is unnecessary) > {noformat} > predicate: (((key < 8) and key is not null) and ((key = 0) or (key = 5))) > (type: boolean) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11059) hcatalog-server-extensions tests scope should depend on hive-exec
[ https://issues.apache.org/jira/browse/HIVE-11059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-11059: Attachment: HIVE-11059.patch Patch attached. [~hsubramaniyan], could you please review? > hcatalog-server-extensions tests scope should depend on hive-exec > - > > Key: HIVE-11059 > URL: https://issues.apache.org/jira/browse/HIVE-11059 > Project: Hive > Issue Type: Bug > Components: Tests >Affects Versions: 1.2.1 >Reporter: Sushanth Sowmyan >Assignee: Sushanth Sowmyan >Priority: Minor > Attachments: HIVE-11059.patch > > > (causes test failures in Windows due to the lack of WindowsPathUtil being > available otherwise) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11059) hcatalog-server-extensions tests scope should depend on hive-exec
[ https://issues.apache.org/jira/browse/HIVE-11059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-11059: Description: (causes test failures in Windows due to the lack of WindowsPathUtil being available otherwise) > hcatalog-server-extensions tests scope should depend on hive-exec > - > > Key: HIVE-11059 > URL: https://issues.apache.org/jira/browse/HIVE-11059 > Project: Hive > Issue Type: Bug > Components: Tests >Affects Versions: 1.2.1 >Reporter: Sushanth Sowmyan >Assignee: Sushanth Sowmyan >Priority: Minor > > (causes test failures in Windows due to the lack of WindowsPathUtil being > available otherwise) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10996) Aggregation / Projection over Multi-Join Inner Query producing incorrect results
[ https://issues.apache.org/jira/browse/HIVE-10996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594045#comment-14594045 ] Hive QA commented on HIVE-10996: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12740684/HIVE-10996.06.patch {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 9011 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join28 org.apache.hive.jdbc.TestJdbcWithLocalClusterSpark.testTempTable {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4324/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4324/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4324/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12740684 - PreCommit-HIVE-TRUNK-Build > Aggregation / Projection over Multi-Join Inner Query producing incorrect > results > > > Key: HIVE-10996 > URL: https://issues.apache.org/jira/browse/HIVE-10996 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 1.0.0, 1.2.0, 1.1.0, 1.3.0, 2.0.0 >Reporter: Gautam Kowshik >Assignee: Jesus Camacho Rodriguez >Priority: Critical > Attachments: HIVE-10996.01.patch, HIVE-10996.02.patch, > HIVE-10996.03.patch, HIVE-10996.04.patch, HIVE-10996.05.patch, > HIVE-10996.06.patch, HIVE-10996.patch, explain_q1.txt, explain_q2.txt > > > We see the following problem on 1.1.0 and 1.2.0 but not 0.13 which seems like > a regression. > The following query (Q1) produces no results: > {code} > select s > from ( > select last.*, action.st2, action.n > from ( > select purchase.s, purchase.timestamp, max (mevt.timestamp) as > last_stage_timestamp > from (select * from purchase_history) purchase > join (select * from cart_history) mevt > on purchase.s = mevt.s > where purchase.timestamp > mevt.timestamp > group by purchase.s, purchase.timestamp > ) last > join (select * from events) action > on last.s = action.s and last.last_stage_timestamp = action.timestamp > ) list; > {code} > While this one (Q2) does produce results : > {code} > select * > from ( > select last.*, action.st2, action.n > from ( > select purchase.s, purchase.timestamp, max (mevt.timestamp) as > last_stage_timestamp > from (select * from purchase_history) purchase > join (select * from cart_history) mevt > on purchase.s = mevt.s > where purchase.timestamp > mevt.timestamp > group by purchase.s, purchase.timestamp > ) last > join (select * from events) action > on last.s = action.s and last.last_stage_timestamp = action.timestamp > ) list; > 1 21 20 Bob 1234 > 1 31 30 Bob 1234 > 3 51 50 Jeff1234 > {code} > The setup to test this is: > {code} > create table purchase_history (s string, product string, price double, > timestamp int); > insert into 
purchase_history values ('1', 'Belt', 20.00, 21); > insert into purchase_history values ('1', 'Socks', 3.50, 31); > insert into purchase_history values ('3', 'Belt', 20.00, 51); > insert into purchase_history values ('4', 'Shirt', 15.50, 59); > create table cart_history (s string, cart_id int, timestamp int); > insert into cart_history values ('1', 1, 10); > insert into cart_history values ('1', 2, 20); > insert into cart_history values ('1', 3, 30); > insert into cart_history values ('1', 4, 40); > insert into cart_history values ('3', 5, 50); > insert into cart_history values ('4', 6, 60); > create table events (s string, st2 string, n int, timestamp int); > insert into events values ('1', 'Bob', 1234, 20); > insert into events values ('1', 'Bob', 1234, 30); > insert into events values ('1', 'Bob', 1234, 25); > insert into events values ('2', 'Sam', 1234, 30); > insert into events values ('3', 'Jeff', 1234, 50); > insert into events values ('4', 'Ted', 1234, 60); > {code} > I realize select * and select s are not all that interesting in this context > but what lead us to this issue was select count(distinct s) was not returning > results. The above queries are the simplified queries that produce the issue. > I will note that if I convert the inner join to a
[jira] [Updated] (HIVE-11042) Need fix Utilities.replaceTaskId method
[ https://issues.apache.org/jira/browse/HIVE-11042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongzhi Chen updated HIVE-11042: Attachment: HIVE-11042.3.patch Attaching patch 3 to remove the extra line. > Need fix Utilities.replaceTaskId method > --- > > Key: HIVE-11042 > URL: https://issues.apache.org/jira/browse/HIVE-11042 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 1.2.0 >Reporter: Yongzhi Chen >Assignee: Yongzhi Chen > Attachments: HIVE-11042.1.patch, HIVE-11042.2.patch, > HIVE-11042.3.patch > > > While looking at another bug, I found that the Utilities.replaceTaskId(String, > int) method is not correct. > For example, > Utilities.replaceTaskId("(ds%3D1)01", 5); > returns 5. > It should return (ds%3D1)05. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
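The intended behavior in the HIVE-11042 report above — replace only the trailing task id while preserving a prefix such as "(ds%3D1)" and its zero-padding — can be sketched as follows. This is a hypothetical illustration of the desired contract, not Hive's actual Utilities implementation:

```java
// Hypothetical sketch of the behavior HIVE-11042 asks for: replace only
// the trailing run of digits (the task id), preserving any prefix such
// as "(ds%3D1)" and keeping the original zero-padding width.
class TaskIdReplace {
    static String replaceTaskId(String name, int newTaskId) {
        int end = name.length();
        int start = end;
        // Walk backwards over the trailing digit run.
        while (start > 0 && Character.isDigit(name.charAt(start - 1))) {
            start--;
        }
        if (start == end) {
            return name; // no trailing task id; leave the input unchanged
        }
        // Zero-pad the new id to the same width as the old one.
        String padded = String.format("%0" + (end - start) + "d", newTaskId);
        return name.substring(0, start) + padded;
    }

    public static void main(String[] args) {
        System.out.println(replaceTaskId("(ds%3D1)01", 5)); // (ds%3D1)05
    }
}
```

With this contract, replaceTaskId("(ds%3D1)01", 5) yields "(ds%3D1)05" rather than "5", which is the example given in the ticket.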
[jira] [Commented] (HIVE-11042) Need fix Utilities.replaceTaskId method
[ https://issues.apache.org/jira/browse/HIVE-11042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593971#comment-14593971 ] Chao Sun commented on HIVE-11042: - +1 Can you also remove the extra line before this method? > Need fix Utilities.replaceTaskId method > --- > > Key: HIVE-11042 > URL: https://issues.apache.org/jira/browse/HIVE-11042 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 1.2.0 >Reporter: Yongzhi Chen >Assignee: Yongzhi Chen > Attachments: HIVE-11042.1.patch, HIVE-11042.2.patch > > > While looking at another bug, I found that the Utilities.replaceTaskId(String, > int) method is not correct. > For example, > Utilities.replaceTaskId("(ds%3D1)01", 5); > returns 5. > It should return (ds%3D1)05. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11058) Make alter_merge* tests (ORC only) stable across different OSes
[ https://issues.apache.org/jira/browse/HIVE-11058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-11058: - Summary: Make alter_merge* tests (ORC only) stable across different OSes (was: Make alter_merge* tests stable across different OSes) > Make alter_merge* tests (ORC only) stable across different OSes > --- > > Key: HIVE-11058 > URL: https://issues.apache.org/jira/browse/HIVE-11058 > Project: Hive > Issue Type: Bug >Affects Versions: 1.2.0, 2.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > > alter_merge* (ORC only) tests are showing stats diff in different OSes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11058) Make alter_merge* tests stable across different OSes
[ https://issues.apache.org/jira/browse/HIVE-11058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-11058: - Description: alter_merge* (ORC only) tests are showing stats diff in different OSes. (was: alter_merge* tests are showing stats diff in different OSes.) > Make alter_merge* tests stable across different OSes > > > Key: HIVE-11058 > URL: https://issues.apache.org/jira/browse/HIVE-11058 > Project: Hive > Issue Type: Bug >Affects Versions: 1.2.0, 2.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > > alter_merge* (ORC only) tests are showing stats diff in different OSes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11042) Need fix Utilities.replaceTaskId method
[ https://issues.apache.org/jira/browse/HIVE-11042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593843#comment-14593843 ] Hive QA commented on HIVE-11042: {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12740664/HIVE-11042.2.patch {color:green}SUCCESS:{color} +1 9011 tests passed Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4323/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4323/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4323/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12740664 - PreCommit-HIVE-TRUNK-Build > Need fix Utilities.replaceTaskId method > --- > > Key: HIVE-11042 > URL: https://issues.apache.org/jira/browse/HIVE-11042 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 1.2.0 >Reporter: Yongzhi Chen >Assignee: Yongzhi Chen > Attachments: HIVE-11042.1.patch, HIVE-11042.2.patch > > > While looking at another bug, I found that the Utilities.replaceTaskId(String, > int) method is not correct. > For example, > Utilities.replaceTaskId("(ds%3D1)01", 5); > returns 5. > It should return (ds%3D1)05. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11042) Need fix Utilities.replaceTaskId method
[ https://issues.apache.org/jira/browse/HIVE-11042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593833#comment-14593833 ] Yongzhi Chen commented on HIVE-11042: - [~csun], I think replaceTaskId(String, int) is the right name; we may need to fix other method names. I will file a separate jira and work on it once I fully understand the other methods' use cases. I made the method public so that I could unit test it; it should do no harm, and a couple of the other replaceTask... methods are public too. I do not know why git diff kept giving me a previous version of the change, but after creating a new branch and cherry-picking my change the issue is resolved. Attaching a new version of patch 2. Thanks. > Need fix Utilities.replaceTaskId method > --- > > Key: HIVE-11042 > URL: https://issues.apache.org/jira/browse/HIVE-11042 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 1.2.0 >Reporter: Yongzhi Chen >Assignee: Yongzhi Chen > Attachments: HIVE-11042.1.patch, HIVE-11042.2.patch > > > While looking at another bug, I found that the Utilities.replaceTaskId(String, > int) method is not correct. > For example, > Utilities.replaceTaskId("(ds%3D1)01", 5); > returns 5. > It should return (ds%3D1)05. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11042) Need fix Utilities.replaceTaskId method
[ https://issues.apache.org/jira/browse/HIVE-11042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongzhi Chen updated HIVE-11042: Attachment: (was: HIVE-11042.2.patch) > Need fix Utilities.replaceTaskId method > --- > > Key: HIVE-11042 > URL: https://issues.apache.org/jira/browse/HIVE-11042 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 1.2.0 >Reporter: Yongzhi Chen >Assignee: Yongzhi Chen > Attachments: HIVE-11042.1.patch, HIVE-11042.2.patch > > > While looking at another bug, I found that the Utilities.replaceTaskId(String, > int) method is not correct. > For example, > Utilities.replaceTaskId("(ds%3D1)01", 5); > returns 5. > It should return (ds%3D1)05. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11042) Need fix Utilities.replaceTaskId method
[ https://issues.apache.org/jira/browse/HIVE-11042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongzhi Chen updated HIVE-11042: Attachment: HIVE-11042.2.patch > Need fix Utilities.replaceTaskId method > --- > > Key: HIVE-11042 > URL: https://issues.apache.org/jira/browse/HIVE-11042 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 1.2.0 >Reporter: Yongzhi Chen >Assignee: Yongzhi Chen > Attachments: HIVE-11042.1.patch, HIVE-11042.2.patch > > > While looking at another bug, I found that the Utilities.replaceTaskId(String, > int) method is not correct. > For example, > Utilities.replaceTaskId("(ds%3D1)01", 5); > returns 5. > It should return (ds%3D1)05. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11057) HBase metastore chokes on partition with ':' in name
[ https://issues.apache.org/jira/browse/HIVE-11057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated HIVE-11057: -- Attachment: HIVE-11057.patch This patch changes the separator from colon to ^A > HBase metastore chokes on partition with ':' in name > > > Key: HIVE-11057 > URL: https://issues.apache.org/jira/browse/HIVE-11057 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: hbase-metastore-branch >Reporter: Alan Gates >Assignee: Alan Gates > Fix For: hbase-metastore-branch > > Attachments: HIVE-11057.patch > > > The HBase metastore uses ':' as a key separator when building keys for the > partition table. This means that partitions with a colon in the name (which > is legal) cause problems. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
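The key-separator problem HIVE-11057 fixes above can be illustrated with a small sketch (hypothetical names, not the actual HBase metastore code): packing key components with ':' is ambiguous once a legal partition value itself contains a colon, whereas the patch's '\u0001' (^A) separator relies on that control character not appearing in partition names:

```java
// Hypothetical sketch of the HIVE-11057 key-building problem: packing
// key components with a separator that may also occur in the data.
class PartitionKeys {
    static String pack(char sep, String... parts) {
        return String.join(String.valueOf(sep), parts);
    }

    public static void main(String[] args) {
        // A partition value containing ':' is legal in Hive.
        String partVal = "ts=12:30";

        // With ':' as the separator, the packed key splits into 4 fields
        // instead of 3 -- a reader can no longer tell where the
        // partition value begins.
        String colonKey = pack(':', "db", "tbl", partVal);
        System.out.println(colonKey.split(":", -1).length); // 4

        // With '\u0001' (^A) as the separator, as in the patch, the key
        // round-trips into exactly 3 fields.
        String ctrlKey = pack('\u0001', "db", "tbl", partVal);
        System.out.println(ctrlKey.split("\u0001", -1).length); // 3
    }
}
```

The general design point: a packed-key separator must be chosen from outside the alphabet of the packed values, or the values must be escaped before packing.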
[jira] [Commented] (HIVE-10970) Revert HIVE-10453: HS2 leaking open file descriptors when using UDFs
[ https://issues.apache.org/jira/browse/HIVE-10970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593803#comment-14593803 ] Yongzhi Chen commented on HIVE-10970: - [~vgumashta], have you found a way to reproduce this? If you have, could you share it with me? HIVE-10453 has to be fixed. Thanks > Revert HIVE-10453: HS2 leaking open file descriptors when using UDFs > > > Key: HIVE-10970 > URL: https://issues.apache.org/jira/browse/HIVE-10970 > Project: Hive > Issue Type: Bug >Reporter: Vaibhav Gumashta >Assignee: Vaibhav Gumashta > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10594) Remote Spark client doesn't use Kerberos keytab to authenticate [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-10594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593772#comment-14593772 ] Hive QA commented on HIVE-10594: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12740689/HIVE-10594.1-spark.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 7987 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.initializationError org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_index_bitmap_auto {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/897/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/897/console Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-897/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12740689 - PreCommit-HIVE-SPARK-Build > Remote Spark client doesn't use Kerberos keytab to authenticate [Spark Branch] > -- > > Key: HIVE-10594 > URL: https://issues.apache.org/jira/browse/HIVE-10594 > Project: Hive > Issue Type: Sub-task > Components: Spark >Affects Versions: 1.1.0 >Reporter: Chao Sun >Assignee: Xuefu Zhang > Attachments: HIVE-10594.1-spark.patch > > > Reporting problem found by one of the HoS users: > Currently, if user is running Beeline on a different host than HS2, and > he/she didn't do kinit on the HS2 host, then he/she may get the following > error: > {code} > 2015-04-29 15:49:34,614 INFO org.apache.hive.spark.client.SparkClientImpl: > 15/04/29 15:49:34 WARN UserGroupInformation: PriviledgedActionException > as:hive (auth:KERBEROS) cause:java.io.IOException: > javax.security.sasl.SaslException: GSS initiate failed [Caused by > GSSException: No valid credentials provided (Mechanism level: Failed to find > any Kerberos tgt)] > 2015-04-29 15:49:34,652 INFO org.apache.hive.spark.client.SparkClientImpl: > Exception in thread "main" java.io.IOException: Failed on local exception: > java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed > [Caused by GSSException: No valid credentials provided (Mechanism level: > Failed to find any Kerberos tgt)]; Host Details : local host is: > "secure-hos-1.ent.cloudera.com/10.20.77.79"; destination host is: > "secure-hos-1.ent.cloudera.com":8032; > 2015-04-29 15:49:34,653 INFO org.apache.hive.spark.client.SparkClientImpl: > at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772) > 2015-04-29 15:49:34,653 INFO org.apache.hive.spark.client.SparkClientImpl: > at org.apache.hadoop.ipc.Client.call(Client.java:1472) > 2015-04-29 15:49:34,654 INFO org.apache.hive.spark.client.SparkClientImpl: > at org.apache.hadoop.ipc.Client.call(Client.java:1399) > 2015-04-29 15:49:34,654 INFO org.apache.hive.spark.client.SparkClientImpl: > at > 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232) > 2015-04-29 15:49:34,654 INFO org.apache.hive.spark.client.SparkClientImpl: > at com.sun.proxy.$Proxy11.getClusterMetrics(Unknown Source) > 2015-04-29 15:49:34,655 INFO org.apache.hive.spark.client.SparkClientImpl: > at > org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getClusterMetrics(ApplicationClientProtocolPBClientImpl.java:202) > 2015-04-29 15:49:34,655 INFO org.apache.hive.spark.client.SparkClientImpl: > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > 2015-04-29 15:49:34,655 INFO org.apache.hive.spark.client.SparkClientImpl: > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > 2015-04-29 15:49:34,656 INFO org.apache.hive.spark.client.SparkClientImpl: > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > 2015-04-29 15:49:34,656 INFO org.apache.hive.spark.client.SparkClientImpl: > at java.lang.reflect.Method.invoke(Method.java:606) > 2015-04-29 15:49:34,656 INFO org.apache.hive.spark.client.SparkClientImpl: > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187) > 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl: > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102) > 2015-04-29 15:49:34,657 I
[jira] [Commented] (HIVE-11048) Make test cbo_windowing robust
[ https://issues.apache.org/jira/browse/HIVE-11048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593771#comment-14593771 ] Jesus Camacho Rodriguez commented on HIVE-11048: LGTM, +1 > Make test cbo_windowing robust > -- > > Key: HIVE-11048 > URL: https://issues.apache.org/jira/browse/HIVE-11048 > Project: Hive > Issue Type: Test > Components: Tests >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Attachments: HIVE-11048.patch > > > Add partition / order by in over clause to make result set deterministic. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-11034) Joining multiple tables producing different results with different order of join
[ https://issues.apache.org/jira/browse/HIVE-11034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang resolved HIVE-11034. Resolution: Duplicate I close this as a dupe. However, please feel free to reopen if the problem persists. > Joining multiple tables producing different results with different order of > join > > > Key: HIVE-11034 > URL: https://issues.apache.org/jira/browse/HIVE-11034 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 0.13.0 > Environment: Linux 2.6.32-279.19.1.el6.x86_64 >Reporter: Srini Pindi >Priority: Critical > > {panel} > Join between tables with different join columns from main table yielding > wrong results in hive. > Changing the order of the joins between main table and other tables is > producing different results. > {panel} > Please see below for the steps to reproduce the issue: > 1. Create tables as follows: > create table p(ck string, email string); > create table a1(ck string, flag string); > create table a2(email string, flag string); > create table a3(ck string, flag string); > 2. Load data into the tables as follows: > P > ||ck||email|| > |10|e10| > |20|e20| > |30|e30| > |40|e40| > > A1 > ||ck||flag|| > |10|N| > |20|Y| > |30|Y| > |40|Y| > A2 > ||email||flag|| > |e10|Y| > |e20|N| > |e30|Y| > |e40|Y| > > A3 > ||ck||flag|| > |10|Y| > |20|Y| > |30|N| > |40|Y| > > 3. Good query: > {panel} > select p.ck > from p > left outer join a1 on p.ck = a1.ck > left outer join a3 on p.ck = a3.ck > left outer join a2 on p.email = a2.email > where a1.flag = 'Y' > and a3.flag = 'Y' > and a2.flag = 'Y' > ; > {panel} > and results are > 40 > 4. Bad query > {panel} > select p.ck > from p > left outer join a1 on p.ck = a1.ck > left outer join a2 on p.email = a2.email > left outer join a3 on p.ck = a3.ck > where a1.flag = 'Y' > and a2.flag = 'Y' > and a3.flag = 'Y' > ; > {panel} > Producing results as: > 30 > 40 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10533) CBO (Calcite Return Path): Join to MultiJoin support for outer joins
[ https://issues.apache.org/jira/browse/HIVE-10533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593714#comment-14593714 ] Hive QA commented on HIVE-10533: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12740660/HIVE-10533.04.patch {color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 9012 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join25 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mapjoin_hook org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join2 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join2 org.apache.hadoop.hive.ql.TestMTQueries.testMTQueries1 {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4322/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4322/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4322/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 7 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12740660 - PreCommit-HIVE-TRUNK-Build > CBO (Calcite Return Path): Join to MultiJoin support for outer joins > > > Key: HIVE-10533 > URL: https://issues.apache.org/jira/browse/HIVE-10533 > Project: Hive > Issue Type: Sub-task > Components: CBO >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-10533.01.patch, HIVE-10533.02.patch, > HIVE-10533.02.patch, HIVE-10533.03.patch, HIVE-10533.04.patch, > HIVE-10533.patch > > > CBO return path: auto_join7.q can be used to reproduce the problem. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10972) DummyTxnManager always locks the current database in shared mode, which is incorrect.
[ https://issues.apache.org/jira/browse/HIVE-10972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593685#comment-14593685 ] Alan Gates commented on HIVE-10972: --- Yes, you're right. I see where it's getting the parent locks. +1 to committing this patch. > DummyTxnManager always locks the current database in shared mode, which is > incorrect. > - > > Key: HIVE-10972 > URL: https://issues.apache.org/jira/browse/HIVE-10972 > Project: Hive > Issue Type: Bug > Components: Locking >Affects Versions: 2.0.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Attachments: HIVE-10972.2.patch, HIVE-10972.patch > > > In DummyTxnManager [line 163 | > http://grepcode.com/file/repo1.maven.org/maven2/co.cask.cdap/hive-exec/0.13.0/org/apache/hadoop/hive/ql/lockmgr/DummyTxnManager.java#163], > it always locks the current database. > That is not correct since the current database can be "db1", and the query > can be "select * from db2.tb1", which will lock db1 unnecessarily. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
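The shape of the fix being reviewed can be sketched as follows (a hypothetical Python illustration, not the actual Java in DummyTxnManager; the `databases_to_lock` helper is invented for this sketch): derive the databases to lock from the tables the query actually references, falling back to the current database only for unqualified table names.

```python
def databases_to_lock(current_db, referenced_tables):
    """Return the set of databases a query's shared locks should cover."""
    dbs = set()
    for table in referenced_tables:
        db, _, _ = table.rpartition(".")
        # Unqualified names like "tb1" resolve against the current database;
        # qualified names like "db2.tb1" name their database explicitly.
        dbs.add(db or current_db)
    return dbs

# "select * from db2.tb1" run while db1 is current should lock only db2:
print(databases_to_lock("db1", ["db2.tb1"]))  # -> {'db2'}
print(databases_to_lock("db1", ["tb1"]))      # -> {'db1'}
```

Unconditionally locking the session's current database, as the old code did, adds a spurious shared lock on db1 in the first case.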
[jira] [Commented] (HIVE-10479) CBO: Calcite Operator To Hive Operator (Calcite Return Path) Empty tabAlias in columnInfo which triggers PPD
[ https://issues.apache.org/jira/browse/HIVE-10479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593672#comment-14593672 ] Laljo John Pullokkaran commented on HIVE-10479: --- +1 > CBO: Calcite Operator To Hive Operator (Calcite Return Path) Empty tabAlias > in columnInfo which triggers PPD > > > Key: HIVE-10479 > URL: https://issues.apache.org/jira/browse/HIVE-10479 > Project: Hive > Issue Type: Sub-task > Components: CBO >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-10479.01.patch, HIVE-10479.02.patch, > HIVE-10479.03.patch, HIVE-10479.patch > > > In ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java, line 477, > when aliases contains the empty string "" and the key is the empty string "" > too, it assumes that aliases contains the key. This triggers incorrect PPD. > To reproduce it, apply the HIVE-10455 patch and run cbo_subq_notin.q. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
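The failure mode described can be illustrated with a small Python sketch (hypothetical; the actual check lives in Java in OpProcFactory, and both helper names here are invented): a plain containment test treats an empty-string key as matching an empty tabAlias, so the predicate is pushed down when it should not be.

```python
def should_push_down_buggy(aliases, key):
    # A bare containment check: "" in {"", ...} is True, so an empty
    # tabAlias spuriously matches an empty key.
    return key in aliases

def should_push_down_fixed(aliases, key):
    # Guard against the degenerate empty-string case first.
    return key != "" and key in aliases

aliases = {"", "t1"}
print(should_push_down_buggy(aliases, ""))   # -> True (incorrect PPD trigger)
print(should_push_down_fixed(aliases, ""))   # -> False
print(should_push_down_fixed(aliases, "t1")) # -> True (real alias still works)
```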
[jira] [Updated] (HIVE-10594) Remote Spark client doesn't use Kerberos keytab to authenticate [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-10594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-10594: --- Attachment: HIVE-10594.1-spark.patch > Remote Spark client doesn't use Kerberos keytab to authenticate [Spark Branch] > -- > > Key: HIVE-10594 > URL: https://issues.apache.org/jira/browse/HIVE-10594 > Project: Hive > Issue Type: Sub-task > Components: Spark >Affects Versions: 1.1.0 >Reporter: Chao Sun >Assignee: Xuefu Zhang > Attachments: HIVE-10594.1-spark.patch > > > Reporting problem found by one of the HoS users: > Currently, if user is running Beeline on a different host than HS2, and > he/she didn't do kinit on the HS2 host, then he/she may get the following > error: > {code} > 2015-04-29 15:49:34,614 INFO org.apache.hive.spark.client.SparkClientImpl: > 15/04/29 15:49:34 WARN UserGroupInformation: PriviledgedActionException > as:hive (auth:KERBEROS) cause:java.io.IOException: > javax.security.sasl.SaslException: GSS initiate failed [Caused by > GSSException: No valid credentials provided (Mechanism level: Failed to find > any Kerberos tgt)] > 2015-04-29 15:49:34,652 INFO org.apache.hive.spark.client.SparkClientImpl: > Exception in thread "main" java.io.IOException: Failed on local exception: > java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed > [Caused by GSSException: No valid credentials provided (Mechanism level: > Failed to find any Kerberos tgt)]; Host Details : local host is: > "secure-hos-1.ent.cloudera.com/10.20.77.79"; destination host is: > "secure-hos-1.ent.cloudera.com":8032; > 2015-04-29 15:49:34,653 INFO org.apache.hive.spark.client.SparkClientImpl: > at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772) > 2015-04-29 15:49:34,653 INFO org.apache.hive.spark.client.SparkClientImpl: > at org.apache.hadoop.ipc.Client.call(Client.java:1472) > 2015-04-29 15:49:34,654 INFO org.apache.hive.spark.client.SparkClientImpl: > at 
org.apache.hadoop.ipc.Client.call(Client.java:1399) > 2015-04-29 15:49:34,654 INFO org.apache.hive.spark.client.SparkClientImpl: > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232) > 2015-04-29 15:49:34,654 INFO org.apache.hive.spark.client.SparkClientImpl: > at com.sun.proxy.$Proxy11.getClusterMetrics(Unknown Source) > 2015-04-29 15:49:34,655 INFO org.apache.hive.spark.client.SparkClientImpl: > at > org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getClusterMetrics(ApplicationClientProtocolPBClientImpl.java:202) > 2015-04-29 15:49:34,655 INFO org.apache.hive.spark.client.SparkClientImpl: > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > 2015-04-29 15:49:34,655 INFO org.apache.hive.spark.client.SparkClientImpl: > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > 2015-04-29 15:49:34,656 INFO org.apache.hive.spark.client.SparkClientImpl: > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > 2015-04-29 15:49:34,656 INFO org.apache.hive.spark.client.SparkClientImpl: > at java.lang.reflect.Method.invoke(Method.java:606) > 2015-04-29 15:49:34,656 INFO org.apache.hive.spark.client.SparkClientImpl: > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187) > 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl: > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102) > 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl: > at com.sun.proxy.$Proxy12.getClusterMetrics(Unknown Source) > 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl: > at > org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getYarnClusterMetrics(YarnClientImpl.java:461) > 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl: > at > 
org.apache.spark.deploy.yarn.Client$$anonfun$submitApplication$1.apply(Client.scala:91) > 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl: > at > org.apache.spark.deploy.yarn.Client$$anonfun$submitApplication$1.apply(Client.scala:91) > 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl: > at org.apache.spark.Logging$class.logInfo(Logging.scala:59) > 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl: > at org.apache.spark.deploy.yarn.Client.logInfo(Client.scala:49) > 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl: > at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:90) > 2015-04-29 15:49:34,658 INFO org.ap
[jira] [Assigned] (HIVE-10594) Remote Spark client doesn't use Kerberos keytab to authenticate [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-10594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang reassigned HIVE-10594: -- Assignee: Xuefu Zhang > Remote Spark client doesn't use Kerberos keytab to authenticate [Spark Branch] > -- > > Key: HIVE-10594 > URL: https://issues.apache.org/jira/browse/HIVE-10594 > Project: Hive > Issue Type: Sub-task > Components: Spark >Affects Versions: 1.1.0 >Reporter: Chao Sun >Assignee: Xuefu Zhang > > Reporting problem found by one of the HoS users: > Currently, if user is running Beeline on a different host than HS2, and > he/she didn't do kinit on the HS2 host, then he/she may get the following > error: > {code} > 2015-04-29 15:49:34,614 INFO org.apache.hive.spark.client.SparkClientImpl: > 15/04/29 15:49:34 WARN UserGroupInformation: PriviledgedActionException > as:hive (auth:KERBEROS) cause:java.io.IOException: > javax.security.sasl.SaslException: GSS initiate failed [Caused by > GSSException: No valid credentials provided (Mechanism level: Failed to find > any Kerberos tgt)] > 2015-04-29 15:49:34,652 INFO org.apache.hive.spark.client.SparkClientImpl: > Exception in thread "main" java.io.IOException: Failed on local exception: > java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed > [Caused by GSSException: No valid credentials provided (Mechanism level: > Failed to find any Kerberos tgt)]; Host Details : local host is: > "secure-hos-1.ent.cloudera.com/10.20.77.79"; destination host is: > "secure-hos-1.ent.cloudera.com":8032; > 2015-04-29 15:49:34,653 INFO org.apache.hive.spark.client.SparkClientImpl: > at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772) > 2015-04-29 15:49:34,653 INFO org.apache.hive.spark.client.SparkClientImpl: > at org.apache.hadoop.ipc.Client.call(Client.java:1472) > 2015-04-29 15:49:34,654 INFO org.apache.hive.spark.client.SparkClientImpl: > at org.apache.hadoop.ipc.Client.call(Client.java:1399) > 2015-04-29 15:49:34,654 INFO 
org.apache.hive.spark.client.SparkClientImpl: > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232) > 2015-04-29 15:49:34,654 INFO org.apache.hive.spark.client.SparkClientImpl: > at com.sun.proxy.$Proxy11.getClusterMetrics(Unknown Source) > 2015-04-29 15:49:34,655 INFO org.apache.hive.spark.client.SparkClientImpl: > at > org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getClusterMetrics(ApplicationClientProtocolPBClientImpl.java:202) > 2015-04-29 15:49:34,655 INFO org.apache.hive.spark.client.SparkClientImpl: > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > 2015-04-29 15:49:34,655 INFO org.apache.hive.spark.client.SparkClientImpl: > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > 2015-04-29 15:49:34,656 INFO org.apache.hive.spark.client.SparkClientImpl: > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > 2015-04-29 15:49:34,656 INFO org.apache.hive.spark.client.SparkClientImpl: > at java.lang.reflect.Method.invoke(Method.java:606) > 2015-04-29 15:49:34,656 INFO org.apache.hive.spark.client.SparkClientImpl: > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187) > 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl: > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102) > 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl: > at com.sun.proxy.$Proxy12.getClusterMetrics(Unknown Source) > 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl: > at > org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getYarnClusterMetrics(YarnClientImpl.java:461) > 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl: > at > org.apache.spark.deploy.yarn.Client$$anonfun$submitApplication$1.apply(Client.scala:91) > 2015-04-29 15:49:34,657 INFO 
org.apache.hive.spark.client.SparkClientImpl: > at > org.apache.spark.deploy.yarn.Client$$anonfun$submitApplication$1.apply(Client.scala:91) > 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl: > at org.apache.spark.Logging$class.logInfo(Logging.scala:59) > 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl: > at org.apache.spark.deploy.yarn.Client.logInfo(Client.scala:49) > 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl: > at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:90) > 2015-04-29 15:49:34,658 INFO org.apache.hive.spark.client.SparkClientImpl: > at org.apa
[jira] [Resolved] (HIVE-11000) Hive not able to pass Hive's Kerberos credential to spark-submit process [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-11000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang resolved HIVE-11000. Resolution: Duplicate > Hive not able to pass Hive's Kerberos credential to spark-submit process > [Spark Branch] > --- > > Key: HIVE-11000 > URL: https://issues.apache.org/jira/browse/HIVE-11000 > Project: Hive > Issue Type: Sub-task > Components: Spark >Reporter: Xuefu Zhang > > The net result is that a manual kinit with Hive's keytab is needed on the > host where HS2 is running, or the following error may appear: > {code} > 2015-04-29 15:49:34,614 INFO org.apache.hive.spark.client.SparkClientImpl: > 15/04/29 15:49:34 WARN UserGroupInformation: PriviledgedActionException > as:hive (auth:KERBEROS) cause:java.io.IOException: > javax.security.sasl.SaslException: GSS initiate failed [Caused by > GSSException: No valid credentials provided (Mechanism level: Failed to find > any Kerberos tgt)] > 2015-04-29 15:49:34,652 INFO org.apache.hive.spark.client.SparkClientImpl: > Exception in thread "main" java.io.IOException: Failed on local exception: > java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed > [Caused by GSSException: No valid credentials provided (Mechanism level: > Failed to find any Kerberos tgt)]; Host Details : local host is: > "secure-hos-1.ent.cloudera.com/10.20.77.79"; destination host is: > "secure-hos-1.ent.cloudera.com":8032; > 2015-04-29 15:49:34,653 INFO org.apache.hive.spark.client.SparkClientImpl: > at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772) > 2015-04-29 15:49:34,653 INFO org.apache.hive.spark.client.SparkClientImpl: > at org.apache.hadoop.ipc.Client.call(Client.java:1472) > 2015-04-29 15:49:34,654 INFO org.apache.hive.spark.client.SparkClientImpl: > at org.apache.hadoop.ipc.Client.call(Client.java:1399) > 2015-04-29 15:49:34,654 INFO org.apache.hive.spark.client.SparkClientImpl: > at > 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232) > 2015-04-29 15:49:34,654 INFO org.apache.hive.spark.client.SparkClientImpl: > at com.sun.proxy.$Proxy11.getClusterMetrics(Unknown Source) > 2015-04-29 15:49:34,655 INFO org.apache.hive.spark.client.SparkClientImpl: > at > org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getClusterMetrics(ApplicationClientProtocolPBClientImpl.java:202) > 2015-04-29 15:49:34,655 INFO org.apache.hive.spark.client.SparkClientImpl: > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > 2015-04-29 15:49:34,655 INFO org.apache.hive.spark.client.SparkClientImpl: > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > 2015-04-29 15:49:34,656 INFO org.apache.hive.spark.client.SparkClientImpl: > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > 2015-04-29 15:49:34,656 INFO org.apache.hive.spark.client.SparkClientImpl: > at java.lang.reflect.Method.invoke(Method.java:606) > 2015-04-29 15:49:34,656 INFO org.apache.hive.spark.client.SparkClientImpl: > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187) > 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl: > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102) > 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl: > at com.sun.proxy.$Proxy12.getClusterMetrics(Unknown Source) > 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl: > at > org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getYarnClusterMetrics(YarnClientImpl.java:461) > 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl: > at > org.apache.spark.deploy.yarn.Client$$anonfun$submitApplication$1.apply(Client.scala:91) > 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl: > at > 
org.apache.spark.deploy.yarn.Client$$anonfun$submitApplication$1.apply(Client.scala:91) > 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl: > at org.apache.spark.Logging$class.logInfo(Logging.scala:59) > 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl: > at org.apache.spark.deploy.yarn.Client.logInfo(Client.scala:49) > 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl: > at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:90) > 2015-04-29 15:49:34,658 INFO org.apache.hive.spark.client.SparkClientImpl: > at org.apache.spark.deploy.yarn.Client.run(Client.scala:619) > 2015-04-29 15:49:34,658 INFO org.apache.hive.spark.client.SparkClien
[jira] [Updated] (HIVE-10594) Remote Spark client doesn't use Kerberos keytab to authenticate [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-10594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-10594: --- Issue Type: Sub-task (was: Bug) Parent: HIVE-7292 > Remote Spark client doesn't use Kerberos keytab to authenticate [Spark Branch] > -- > > Key: HIVE-10594 > URL: https://issues.apache.org/jira/browse/HIVE-10594 > Project: Hive > Issue Type: Sub-task > Components: Spark >Affects Versions: 1.1.0 >Reporter: Chao Sun > > Reporting problem found by one of the HoS users: > Currently, if user is running Beeline on a different host than HS2, and > he/she didn't do kinit on the HS2 host, then he/she may get the following > error: > {code} > 2015-04-29 15:49:34,614 INFO org.apache.hive.spark.client.SparkClientImpl: > 15/04/29 15:49:34 WARN UserGroupInformation: PriviledgedActionException > as:hive (auth:KERBEROS) cause:java.io.IOException: > javax.security.sasl.SaslException: GSS initiate failed [Caused by > GSSException: No valid credentials provided (Mechanism level: Failed to find > any Kerberos tgt)] > 2015-04-29 15:49:34,652 INFO org.apache.hive.spark.client.SparkClientImpl: > Exception in thread "main" java.io.IOException: Failed on local exception: > java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed > [Caused by GSSException: No valid credentials provided (Mechanism level: > Failed to find any Kerberos tgt)]; Host Details : local host is: > "secure-hos-1.ent.cloudera.com/10.20.77.79"; destination host is: > "secure-hos-1.ent.cloudera.com":8032; > 2015-04-29 15:49:34,653 INFO org.apache.hive.spark.client.SparkClientImpl: > at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772) > 2015-04-29 15:49:34,653 INFO org.apache.hive.spark.client.SparkClientImpl: > at org.apache.hadoop.ipc.Client.call(Client.java:1472) > 2015-04-29 15:49:34,654 INFO org.apache.hive.spark.client.SparkClientImpl: > at org.apache.hadoop.ipc.Client.call(Client.java:1399) > 2015-04-29 15:49:34,654 INFO 
org.apache.hive.spark.client.SparkClientImpl: > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232) > 2015-04-29 15:49:34,654 INFO org.apache.hive.spark.client.SparkClientImpl: > at com.sun.proxy.$Proxy11.getClusterMetrics(Unknown Source) > 2015-04-29 15:49:34,655 INFO org.apache.hive.spark.client.SparkClientImpl: > at > org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getClusterMetrics(ApplicationClientProtocolPBClientImpl.java:202) > 2015-04-29 15:49:34,655 INFO org.apache.hive.spark.client.SparkClientImpl: > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > 2015-04-29 15:49:34,655 INFO org.apache.hive.spark.client.SparkClientImpl: > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > 2015-04-29 15:49:34,656 INFO org.apache.hive.spark.client.SparkClientImpl: > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > 2015-04-29 15:49:34,656 INFO org.apache.hive.spark.client.SparkClientImpl: > at java.lang.reflect.Method.invoke(Method.java:606) > 2015-04-29 15:49:34,656 INFO org.apache.hive.spark.client.SparkClientImpl: > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187) > 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl: > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102) > 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl: > at com.sun.proxy.$Proxy12.getClusterMetrics(Unknown Source) > 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl: > at > org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getYarnClusterMetrics(YarnClientImpl.java:461) > 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl: > at > org.apache.spark.deploy.yarn.Client$$anonfun$submitApplication$1.apply(Client.scala:91) > 2015-04-29 15:49:34,657 INFO 
org.apache.hive.spark.client.SparkClientImpl: > at > org.apache.spark.deploy.yarn.Client$$anonfun$submitApplication$1.apply(Client.scala:91) > 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl: > at org.apache.spark.Logging$class.logInfo(Logging.scala:59) > 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl: > at org.apache.spark.deploy.yarn.Client.logInfo(Client.scala:49) > 2015-04-29 15:49:34,657 INFO org.apache.hive.spark.client.SparkClientImpl: > at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:90) > 2015-04-29 15:49:34,658 INFO org.apache.hive.spark.client.SparkClientImpl: > at org.apache.s
[jira] [Updated] (HIVE-11037) HiveOnTez: make explain user level = true as default
[ https://issues.apache.org/jira/browse/HIVE-11037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-11037: --- Attachment: HIVE-11037.03.patch > HiveOnTez: make explain user level = true as default > > > Key: HIVE-11037 > URL: https://issues.apache.org/jira/browse/HIVE-11037 > Project: Hive > Issue Type: Improvement >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-11037.01.patch, HIVE-11037.02.patch, > HIVE-11037.03.patch > > > In Hive-9780, we introduced a new level of explain for hive on tez. We would > like to make it running by default. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10996) Aggregation / Projection over Multi-Join Inner Query producing incorrect results
[ https://issues.apache.org/jira/browse/HIVE-10996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-10996: --- Attachment: HIVE-10996.06.patch Updating q files. > Aggregation / Projection over Multi-Join Inner Query producing incorrect > results > > > Key: HIVE-10996 > URL: https://issues.apache.org/jira/browse/HIVE-10996 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 1.0.0, 1.2.0, 1.1.0, 1.3.0, 2.0.0 >Reporter: Gautam Kowshik >Assignee: Jesus Camacho Rodriguez >Priority: Critical > Attachments: HIVE-10996.01.patch, HIVE-10996.02.patch, > HIVE-10996.03.patch, HIVE-10996.04.patch, HIVE-10996.05.patch, > HIVE-10996.06.patch, HIVE-10996.patch, explain_q1.txt, explain_q2.txt > > > We see the following problem on 1.1.0 and 1.2.0 but not 0.13 which seems like > a regression. > The following query (Q1) produces no results: > {code} > select s > from ( > select last.*, action.st2, action.n > from ( > select purchase.s, purchase.timestamp, max (mevt.timestamp) as > last_stage_timestamp > from (select * from purchase_history) purchase > join (select * from cart_history) mevt > on purchase.s = mevt.s > where purchase.timestamp > mevt.timestamp > group by purchase.s, purchase.timestamp > ) last > join (select * from events) action > on last.s = action.s and last.last_stage_timestamp = action.timestamp > ) list; > {code} > While this one (Q2) does produce results : > {code} > select * > from ( > select last.*, action.st2, action.n > from ( > select purchase.s, purchase.timestamp, max (mevt.timestamp) as > last_stage_timestamp > from (select * from purchase_history) purchase > join (select * from cart_history) mevt > on purchase.s = mevt.s > where purchase.timestamp > mevt.timestamp > group by purchase.s, purchase.timestamp > ) last > join (select * from events) action > on last.s = action.s and last.last_stage_timestamp = action.timestamp > ) list; > 1 21 20 Bob 1234 > 1 31 30 Bob 1234 > 3 51 50 
Jeff1234 > {code} > The setup to test this is: > {code} > create table purchase_history (s string, product string, price double, > timestamp int); > insert into purchase_history values ('1', 'Belt', 20.00, 21); > insert into purchase_history values ('1', 'Socks', 3.50, 31); > insert into purchase_history values ('3', 'Belt', 20.00, 51); > insert into purchase_history values ('4', 'Shirt', 15.50, 59); > create table cart_history (s string, cart_id int, timestamp int); > insert into cart_history values ('1', 1, 10); > insert into cart_history values ('1', 2, 20); > insert into cart_history values ('1', 3, 30); > insert into cart_history values ('1', 4, 40); > insert into cart_history values ('3', 5, 50); > insert into cart_history values ('4', 6, 60); > create table events (s string, st2 string, n int, timestamp int); > insert into events values ('1', 'Bob', 1234, 20); > insert into events values ('1', 'Bob', 1234, 30); > insert into events values ('1', 'Bob', 1234, 25); > insert into events values ('2', 'Sam', 1234, 30); > insert into events values ('3', 'Jeff', 1234, 50); > insert into events values ('4', 'Ted', 1234, 60); > {code} > I realize select * and select s are not all that interesting in this context > but what led us to this issue was that select count(distinct s) was not > returning results. The above queries are the simplified queries that produce > the issue. > I will note that if I convert the inner join to a table and select from that > the issue does not appear. > Update: Found that turning off hive.optimize.remove.identity.project fixes > this issue. This optimization was introduced in > https://issues.apache.org/jira/browse/HIVE-8435 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
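The expected results can be checked independently of Hive. The following Python sketch (editor-supplied, not part of the patch) re-implements the inner query over the sample data; it reproduces the three Q2 rows shown above and shows that Q1 should therefore return their first column rather than no rows:

```python
# Sample data from the bug report.
purchase_history = [("1", "Belt", 20.00, 21), ("1", "Socks", 3.50, 31),
                    ("3", "Belt", 20.00, 51), ("4", "Shirt", 15.50, 59)]
cart_history = [("1", 1, 10), ("1", 2, 20), ("1", 3, 30), ("1", 4, 40),
                ("3", 5, 50), ("4", 6, 60)]
events = [("1", "Bob", 1234, 20), ("1", "Bob", 1234, 30),
          ("1", "Bob", 1234, 25), ("2", "Sam", 1234, 30),
          ("3", "Jeff", 1234, 50), ("4", "Ted", 1234, 60)]

# Subquery "last": per (s, purchase.timestamp), the max cart timestamp
# strictly before the purchase.
last = {}
for s, _, _, pts in purchase_history:
    earlier = [cts for cs, _, cts in cart_history if cs == s and pts > cts]
    if earlier:
        last[(s, pts)] = max(earlier)

# Join "last" with events on s and last_stage_timestamp = events.timestamp.
q2 = [(s, pts, lst, st2, n)
      for (s, pts), lst in sorted(last.items())
      for es, st2, n, ets in events
      if es == s and ets == lst]

print(q2)                      # three rows, matching the Q2 output above
print([row[0] for row in q2])  # Q1 should return ['1', '1', '3'], not nothing
```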
[jira] [Commented] (HIVE-11042) Need fix Utilities.replaceTaskId method
[ https://issues.apache.org/jira/browse/HIVE-11042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593597#comment-14593597 ] Chao Sun commented on HIVE-11042: - But it is confusing to have two replaceTaskId methods (albeit with slightly different parameters) that do very different things. Maybe rename them? Also, why is this method public instead of private? The comments for patch #2 are still not changed. > Need fix Utilities.replaceTaskId method > --- > > Key: HIVE-11042 > URL: https://issues.apache.org/jira/browse/HIVE-11042 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 1.2.0 >Reporter: Yongzhi Chen >Assignee: Yongzhi Chen > Attachments: HIVE-11042.1.patch, HIVE-11042.2.patch > > > While I was looking at another bug, I found that the > Utilities.replaceTaskId(String, int) method is not right. > For example, > Utilities.replaceTaskId("(ds%3D1)01", 5); > returns 5 > when it should return (ds%3D1)05 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
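The intended behavior described in the report can be sketched as follows (a hypothetical Python illustration of the contract, not the actual Java implementation in Utilities): replace only the trailing run of task-id digits, preserving any prefix such as a partition-spec string and the original zero padding.

```python
import re

def replace_task_id(filename, task_id):
    """Swap the trailing task-id digits for task_id, keeping the prefix."""
    match = re.search(r"(\d+)$", filename)
    if match is None:
        return filename  # nothing to replace
    width = len(match.group(1))  # preserve the original zero padding
    return filename[:match.start()] + str(task_id).zfill(width)

print(replace_task_id("(ds%3D1)01", 5))  # -> '(ds%3D1)05', prefix kept
```

The bug in the report is precisely the prefix being dropped, so the result for the example above came back without the "(ds%3D1)" part.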
[jira] [Commented] (HIVE-10996) Aggregation / Projection over Multi-Join Inner Query producing incorrect results
[ https://issues.apache.org/jira/browse/HIVE-10996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593595#comment-14593595 ] Hive QA commented on HIVE-10996: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12740656/HIVE-10996.05.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9011 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_having org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join28 {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4321/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4321/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4321/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12740656 - PreCommit-HIVE-TRUNK-Build > Aggregation / Projection over Multi-Join Inner Query producing incorrect > results > > > Key: HIVE-10996 > URL: https://issues.apache.org/jira/browse/HIVE-10996 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 1.0.0, 1.2.0, 1.1.0, 1.3.0, 2.0.0 >Reporter: Gautam Kowshik >Assignee: Jesus Camacho Rodriguez >Priority: Critical > Attachments: HIVE-10996.01.patch, HIVE-10996.02.patch, > HIVE-10996.03.patch, HIVE-10996.04.patch, HIVE-10996.05.patch, > HIVE-10996.patch, explain_q1.txt, explain_q2.txt > > > We see the following problem on 1.1.0 and 1.2.0 but not 0.13 which seems like > a regression. 
> The following query (Q1) produces no results: > {code} > select s > from ( > select last.*, action.st2, action.n > from ( > select purchase.s, purchase.timestamp, max (mevt.timestamp) as > last_stage_timestamp > from (select * from purchase_history) purchase > join (select * from cart_history) mevt > on purchase.s = mevt.s > where purchase.timestamp > mevt.timestamp > group by purchase.s, purchase.timestamp > ) last > join (select * from events) action > on last.s = action.s and last.last_stage_timestamp = action.timestamp > ) list; > {code} > While this one (Q2) does produce results : > {code} > select * > from ( > select last.*, action.st2, action.n > from ( > select purchase.s, purchase.timestamp, max (mevt.timestamp) as > last_stage_timestamp > from (select * from purchase_history) purchase > join (select * from cart_history) mevt > on purchase.s = mevt.s > where purchase.timestamp > mevt.timestamp > group by purchase.s, purchase.timestamp > ) last > join (select * from events) action > on last.s = action.s and last.last_stage_timestamp = action.timestamp > ) list; > 1 21 20 Bob 1234 > 1 31 30 Bob 1234 > 3 51 50 Jeff1234 > {code} > The setup to test this is: > {code} > create table purchase_history (s string, product string, price double, > timestamp int); > insert into purchase_history values ('1', 'Belt', 20.00, 21); > insert into purchase_history values ('1', 'Socks', 3.50, 31); > insert into purchase_history values ('3', 'Belt', 20.00, 51); > insert into purchase_history values ('4', 'Shirt', 15.50, 59); > create table cart_history (s string, cart_id int, timestamp int); > insert into cart_history values ('1', 1, 10); > insert into cart_history values ('1', 2, 20); > insert into cart_history values ('1', 3, 30); > insert into cart_history values ('1', 4, 40); > insert into cart_history values ('3', 5, 50); > insert into cart_history values ('4', 6, 60); > create table events (s string, st2 string, n int, timestamp int); > insert into events values ('1', 
'Bob', 1234, 20); > insert into events values ('1', 'Bob', 1234, 30); > insert into events values ('1', 'Bob', 1234, 25); > insert into events values ('2', 'Sam', 1234, 30); > insert into events values ('3', 'Jeff', 1234, 50); > insert into events values ('4', 'Ted', 1234, 60); > {code} > I realize select * and select s are not all that interesting in this context > but what lead us to this issue was select count(distinct s) was not returning > results. The above queries are the simplified queries that produce the issue. > I will note that if I convert the inner join to a table and select from that > the issue does not appear. > Update: Found that turning off hive.
[jira] [Commented] (HIVE-11043) ORC split strategies should adapt based on number of files
[ https://issues.apache.org/jira/browse/HIVE-11043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593575#comment-14593575 ] Prasanth Jayachandran commented on HIVE-11043: -- Mostly looks good. A few questions/comments: 1) Can we use the same default for numSplits as MR, i.e. 1 instead of -1? This will make the ETL strategy the default even in the presence of a single small file. {code} return generateSplitsInfo(conf, -1); {code} 2) The condition should be numFiles <= context.minSplits, right? This will avoid choosing BI in the case of 1 small file. 3) I tried some queries, and the numSplits arg in getSplits() can become 0, in which case we will end up using BI as the default even though there are only a small number of files. 4) Some more tests for these corner cases would be helpful. 5) Should we make this independently configurable, instead of using the cache max size? > ORC split strategies should adapt based on number of files > -- > > Key: HIVE-11043 > URL: https://issues.apache.org/jira/browse/HIVE-11043 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Prasanth Jayachandran >Assignee: Gopal V > Fix For: 2.0.0 > > Attachments: HIVE-11043.1.patch > > > The ORC split strategies added in HIVE-10114 choose a strategy based on average > file size. It would be beneficial to choose a different strategy based on the > number of files as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
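Taken together, points 1-3 above amount to a selection rule: use the BI strategy only for directories with many small files, and fall back to ETL otherwise. The following is a minimal illustrative sketch of that rule, not the actual code in Hive's OrcInputFormat; the names (`choose`, `maxAvgFileSize`, `minSplits`) and the sample thresholds are assumptions made for the example.

```java
// Illustrative sketch of the split-strategy rule discussed above;
// NOT the actual Hive OrcInputFormat code.
enum SplitStrategy { BI, ETL }

class SplitStrategyChooser {
    // BI (one split per file, no footer reads) pays off only when files are
    // small on average AND there are more files than minSplits. A single
    // small file should still go through ETL, per point 2 above.
    static SplitStrategy choose(long totalSize, int numFiles,
                                long maxAvgFileSize, int minSplits) {
        long avgFileSize = numFiles == 0 ? 0 : totalSize / numFiles;
        if (avgFileSize <= maxAvgFileSize && numFiles > minSplits) {
            return SplitStrategy.BI;
        }
        return SplitStrategy.ETL;  // default: read footers, split large files
    }

    public static void main(String[] args) {
        // 100 files of ~10 bytes each: many small files
        System.out.println(choose(1000, 100, 256, 1));   // BI
        // a single small file: avoid BI, use ETL
        System.out.println(choose(100, 1, 256, 1));      // ETL
    }
}
```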
[jira] [Commented] (HIVE-10999) Upgrade Spark dependency to 1.4 [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-10999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593540#comment-14593540 ] Hive QA commented on HIVE-10999: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12740653/HIVE-10999.2-spark.patch {color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 8017 tests executed *Failed tests:* {noformat} TestCliDriver-interval_udf.q-metadataonly1.q-union13.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.initializationError org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_lateral_view_explode2 org.apache.hive.jdbc.TestSSL.testSSLConnectionWithProperty org.apache.hive.spark.client.TestSparkClient.testMetricsCollection {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/896/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/896/console Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-896/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 5 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12740653 - PreCommit-HIVE-SPARK-Build > Upgrade Spark dependency to 1.4 [Spark Branch] > -- > > Key: HIVE-10999 > URL: https://issues.apache.org/jira/browse/HIVE-10999 > Project: Hive > Issue Type: Sub-task > Components: Spark >Reporter: Xuefu Zhang >Assignee: Rui Li > Attachments: HIVE-10999.1-spark.patch, HIVE-10999.2-spark.patch, > HIVE-10999.2-spark.patch > > > Spark 1.4.0 is release. Let's update the dependency version from 1.3.1 to > 1.4.0. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11042) Need fix Utilities.replaceTaskId method
[ https://issues.apache.org/jira/browse/HIVE-11042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593521#comment-14593521 ] Yongzhi Chen commented on HIVE-11042: - Patch 2 is attached. Please review. > Need fix Utilities.replaceTaskId method > --- > > Key: HIVE-11042 > URL: https://issues.apache.org/jira/browse/HIVE-11042 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 1.2.0 >Reporter: Yongzhi Chen >Assignee: Yongzhi Chen > Attachments: HIVE-11042.1.patch, HIVE-11042.2.patch > > > While looking at another bug, I found that the Utilities.replaceTaskId(String, > int) method is not correct. > For example: > Utilities.replaceTaskId("(ds%3D1)01", 5); > returns 5 > It should return (ds%3D1)05 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11042) Need fix Utilities.replaceTaskId method
[ https://issues.apache.org/jira/browse/HIVE-11042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongzhi Chen updated HIVE-11042: Attachment: HIVE-11042.2.patch > Need fix Utilities.replaceTaskId method > --- > > Key: HIVE-11042 > URL: https://issues.apache.org/jira/browse/HIVE-11042 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 1.2.0 >Reporter: Yongzhi Chen >Assignee: Yongzhi Chen > Attachments: HIVE-11042.1.patch, HIVE-11042.2.patch > > > While looking at another bug, I found that the Utilities.replaceTaskId(String, > int) method is not correct. > For example: > Utilities.replaceTaskId("(ds%3D1)01", 5); > returns 5 > It should return (ds%3D1)05 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11042) Need fix Utilities.replaceTaskId method
[ https://issues.apache.org/jira/browse/HIVE-11042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593492#comment-14593492 ] Yongzhi Chen commented on HIVE-11042: - [~csun], it would cause more confusion if I changed the existing replaceTaskId(String, String); the main logic is different and the two cannot be combined. The use case of replaceTaskId(String param1, String param2) is that the pattern is in param2: it gets the pattern from param2, gets the length from param1, and replaces within param2. param2 is called a bucket number, but I think it is mainly used for bucket files. replaceTaskId(String param1, int param2) gets both the pattern and the length from param1 and substitutes in the param2 number. So it is not just the same method with the order of the two params switched. I will change the comment and resubmit the patch. Thanks. > Need fix Utilities.replaceTaskId method > --- > > Key: HIVE-11042 > URL: https://issues.apache.org/jira/browse/HIVE-11042 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 1.2.0 >Reporter: Yongzhi Chen >Assignee: Yongzhi Chen > Attachments: HIVE-11042.1.patch > > > While looking at another bug, I found that the Utilities.replaceTaskId(String, > int) method is not correct. > For example: > Utilities.replaceTaskId("(ds%3D1)01", 5); > returns 5 > It should return (ds%3D1)05 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
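For reference, the behavior the report asks for — keep the prefix, replace only the trailing task-id digits, and preserve the zero-padding width — can be sketched as follows. This is a standalone approximation built only from the example above, not the actual code in org.apache.hadoop.hive.ql.exec.Utilities:

```java
class ReplaceTaskIdSketch {
    // Replace the trailing run of digits in 'name' with 'taskId', zero-padded
    // to the width of the original digit run, leaving any prefix (such as a
    // partition spec like "(ds%3D1)") untouched.
    static String replaceTaskId(String name, int taskId) {
        int end = name.length();
        int start = end;
        while (start > 0 && Character.isDigit(name.charAt(start - 1))) {
            start--;
        }
        if (start == end) {
            return name;  // no trailing digits to replace
        }
        String padded = String.format("%0" + (end - start) + "d", taskId);
        return name.substring(0, start) + padded;
    }

    public static void main(String[] args) {
        System.out.println(replaceTaskId("(ds%3D1)01", 5));  // (ds%3D1)05
    }
}
```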
[jira] [Updated] (HIVE-10533) CBO (Calcite Return Path): Join to MultiJoin support for outer joins
[ https://issues.apache.org/jira/browse/HIVE-10533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-10533: --- Attachment: HIVE-10533.04.patch > CBO (Calcite Return Path): Join to MultiJoin support for outer joins > > > Key: HIVE-10533 > URL: https://issues.apache.org/jira/browse/HIVE-10533 > Project: Hive > Issue Type: Sub-task > Components: CBO >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-10533.01.patch, HIVE-10533.02.patch, > HIVE-10533.02.patch, HIVE-10533.03.patch, HIVE-10533.04.patch, > HIVE-10533.patch > > > CBO return path: auto_join7.q can be used to reproduce the problem. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10996) Aggregation / Projection over Multi-Join Inner Query producing incorrect results
[ https://issues.apache.org/jira/browse/HIVE-10996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-10996: --- Attachment: HIVE-10996.05.patch > Aggregation / Projection over Multi-Join Inner Query producing incorrect > results > > > Key: HIVE-10996 > URL: https://issues.apache.org/jira/browse/HIVE-10996 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 1.0.0, 1.2.0, 1.1.0, 1.3.0, 2.0.0 >Reporter: Gautam Kowshik >Assignee: Jesus Camacho Rodriguez >Priority: Critical > Attachments: HIVE-10996.01.patch, HIVE-10996.02.patch, > HIVE-10996.03.patch, HIVE-10996.04.patch, HIVE-10996.05.patch, > HIVE-10996.patch, explain_q1.txt, explain_q2.txt > > > We see the following problem on 1.1.0 and 1.2.0 but not 0.13 which seems like > a regression. > The following query (Q1) produces no results: > {code} > select s > from ( > select last.*, action.st2, action.n > from ( > select purchase.s, purchase.timestamp, max (mevt.timestamp) as > last_stage_timestamp > from (select * from purchase_history) purchase > join (select * from cart_history) mevt > on purchase.s = mevt.s > where purchase.timestamp > mevt.timestamp > group by purchase.s, purchase.timestamp > ) last > join (select * from events) action > on last.s = action.s and last.last_stage_timestamp = action.timestamp > ) list; > {code} > While this one (Q2) does produce results : > {code} > select * > from ( > select last.*, action.st2, action.n > from ( > select purchase.s, purchase.timestamp, max (mevt.timestamp) as > last_stage_timestamp > from (select * from purchase_history) purchase > join (select * from cart_history) mevt > on purchase.s = mevt.s > where purchase.timestamp > mevt.timestamp > group by purchase.s, purchase.timestamp > ) last > join (select * from events) action > on last.s = action.s and last.last_stage_timestamp = action.timestamp > ) list; > 1 21 20 Bob 1234 > 1 31 30 Bob 1234 > 3 51 50 Jeff1234 > {code} > The setup to test 
this is: > {code} > create table purchase_history (s string, product string, price double, > timestamp int); > insert into purchase_history values ('1', 'Belt', 20.00, 21); > insert into purchase_history values ('1', 'Socks', 3.50, 31); > insert into purchase_history values ('3', 'Belt', 20.00, 51); > insert into purchase_history values ('4', 'Shirt', 15.50, 59); > create table cart_history (s string, cart_id int, timestamp int); > insert into cart_history values ('1', 1, 10); > insert into cart_history values ('1', 2, 20); > insert into cart_history values ('1', 3, 30); > insert into cart_history values ('1', 4, 40); > insert into cart_history values ('3', 5, 50); > insert into cart_history values ('4', 6, 60); > create table events (s string, st2 string, n int, timestamp int); > insert into events values ('1', 'Bob', 1234, 20); > insert into events values ('1', 'Bob', 1234, 30); > insert into events values ('1', 'Bob', 1234, 25); > insert into events values ('2', 'Sam', 1234, 30); > insert into events values ('3', 'Jeff', 1234, 50); > insert into events values ('4', 'Ted', 1234, 60); > {code} > I realize select * and select s are not all that interesting in this context > but what lead us to this issue was select count(distinct s) was not returning > results. The above queries are the simplified queries that produce the issue. > I will note that if I convert the inner join to a table and select from that > the issue does not appear. > Update: Found that turning off hive.optimize.remove.identity.project fixes > this issue. This optimization was introduced in > https://issues.apache.org/jira/browse/HIVE-8435 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10999) Upgrade Spark dependency to 1.4 [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-10999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HIVE-10999: -- Attachment: HIVE-10999.2-spark.patch Can't reproduce the failures locally; trying again. > Upgrade Spark dependency to 1.4 [Spark Branch] > -- > > Key: HIVE-10999 > URL: https://issues.apache.org/jira/browse/HIVE-10999 > Project: Hive > Issue Type: Sub-task > Components: Spark >Reporter: Xuefu Zhang >Assignee: Rui Li > Attachments: HIVE-10999.1-spark.patch, HIVE-10999.2-spark.patch, > HIVE-10999.2-spark.patch > > > Spark 1.4.0 is released. Let's update the dependency version from 1.3.1 to > 1.4.0. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10996) Aggregation / Projection over Multi-Join Inner Query producing incorrect results
[ https://issues.apache.org/jira/browse/HIVE-10996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593410#comment-14593410 ] Hive QA commented on HIVE-10996: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12740625/HIVE-10996.04.patch {color:red}ERROR:{color} -1 due to 125 failed/errored test(s), 9011 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join18 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join18_multi_distinct org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join32 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket_groupby org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_gby org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_limit org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_semijoin org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_udf_udaf org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_views org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer8 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_count org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_distinct_stats org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_fold_eq_with_case_when org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby12 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby2_map org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby2_map_multi_distinct org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby2_map_skew org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby3_map org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby3_map_multi_distinct org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby3_map_skew org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby8_map_skew 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby9 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_cube1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_distinct_samekey org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_duplicate_key org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_grouping_id2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_map_ppr org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_map_ppr_multi_distinct org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_multi_insert_common_distinct org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_multi_single_reducer org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_position org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_rollup1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_11 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_8 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_infer_bucket_sort org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join18 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join18_multi_distinct org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_lateral_view org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_limit_pushdown org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mapjoin_mapjoin org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_metadataOnlyOptimizer org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_metadataonly1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_multiMapJoin1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_multi_insert_gby3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_multi_insert_lateral_view org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_nullgroup4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_nullgroup4_multi_distinct org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_optimize_nullscan 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_reduce_deduplicate_extended org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_semijoin org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_multiinsert org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_unqual_corr_expr org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_table_access_keys_stats org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_truncate_column_merge org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_count org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union17 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_unionDistinct_1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_count_distinct org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_groupby_reduce org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_join30 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_leftsemi_mapjoin org.apache.h
[jira] [Commented] (HIVE-10999) Upgrade Spark dependency to 1.4 [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-10999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593406#comment-14593406 ] Hive QA commented on HIVE-10999: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12740629/HIVE-10999.2-spark.patch {color:red}ERROR:{color} -1 due to 12 failed/errored test(s), 7972 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.initializationError org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby_bigdata org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_lateral_view_explode2 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_leftsemijoin org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_mapjoin1 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_ppd_join2 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_skewjoin_noskew org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_skewjoinopt10 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_skewjoinopt2 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union6 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_remove_8 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorization_pushdown {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/895/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/895/console Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-895/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 12 tests failed {noformat} This message is 
automatically generated. ATTACHMENT ID: 12740629 - PreCommit-HIVE-SPARK-Build > Upgrade Spark dependency to 1.4 [Spark Branch] > -- > > Key: HIVE-10999 > URL: https://issues.apache.org/jira/browse/HIVE-10999 > Project: Hive > Issue Type: Sub-task > Components: Spark >Reporter: Xuefu Zhang >Assignee: Rui Li > Attachments: HIVE-10999.1-spark.patch, HIVE-10999.2-spark.patch > > > Spark 1.4.0 is released. Let's update the dependency version from 1.3.1 to > 1.4.0. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11054) Read error : Partition Varchar column cannot be cast to string
[ https://issues.apache.org/jira/browse/HIVE-11054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593402#comment-14593402 ] Xuefu Zhang commented on HIVE-11054: I think this might be fixed in the latest Hive already. [~ctang.ma], any comments? > Read error : Partition Varchar column cannot be cast to string > -- > > Key: HIVE-11054 > URL: https://issues.apache.org/jira/browse/HIVE-11054 > Project: Hive > Issue Type: Bug > Components: Database/Schema >Affects Versions: 0.14.0 >Reporter: Devansh Srivastava > > Hi, > I have one table with VARCHAR and CHAR datatypes. My target table has > a structure like this: > CREATE EXTERNAL TABLE test_table( > dob string COMMENT '', > version_nbr int COMMENT '', > record_status string COMMENT '', > creation_timestamp timestamp COMMENT '') > PARTITIONED BY ( > src_sys_cd varchar(10) COMMENT '', batch_id string COMMENT '') > ROW FORMAT DELIMITED > FIELDS TERMINATED BY '|' > STORED AS ORC > LOCATION > '/test/test_table'; > My source table has a structure like below: > CREATE EXTERNAL TABLE test_staging_table( > dob string COMMENT '', > version_nbr int COMMENT '', > record_status string COMMENT '', > creation_timestamp timestamp COMMENT '', > src_sys_cd varchar(10) COMMENT '', > batch_id string COMMENT '') > ROW FORMAT DELIMITED > FIELDS TERMINATED BY '|' > STORED AS ORC > LOCATION > '/test/test_staging_table'; > We were loading data using a Pig script. It's a direct load, no transformation > needed. But when I was checking test_table's data in Hive.
It is giving > belowmentioned error: > Diagnostic Messages for this Task: > Error: java.io.IOException: java.io.IOException: java.lang.RuntimeException: > java.lang.ClassCastException: org.apache.hadoop.hive.common.type.HiveVarchar > cannot be cast to java.lang.String > at > org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121) > at > org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77) > at > org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:273) > at > org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:183) > at > org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:199) > at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:185) > at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52) > at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) > at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) > Caused by: java.io.IOException: java.lang.RuntimeException: > java.lang.ClassCastException: org.apache.hadoop.hive.common.type.HiveVarchar > cannot be cast to java.lang.String > at > org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121) > at > org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77) > at > 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:352) > at > org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:101) > at > org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:41) > at > org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:115) > at > org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:271) > ... 11 more > Caused by: java.lang.RuntimeException: java.lang.ClassCastException: > org.apache.hadoop.hive.common.type.HiveVarchar cannot be cast to > java.lang.String > at > org.apache.hadoop.hive.ql.io.orc.VectorizedOrcInputFormat$VectorizedOrcRecordReader.next(VectorizedOrcInputFormat.java:95) > at > org.apache.hadoop.hive.ql.io.orc.VectorizedOrcInputFormat$VectorizedOrcRecordReader.next(VectorizedOrcInputFormat.java:49) > at > org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:347) > ... 15 more > Caused by: java.lang.ClassCastException: > org.apache.hadoop.hive.common.type.HiveVarchar cannot be cast to > java.lang.String > at > org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatchCtx.addPartitionColsToBatch(V
[jira] [Commented] (HIVE-10972) DummyTxnManager always locks the current database in shared mode, which is incorrect.
[ https://issues.apache.org/jira/browse/HIVE-10972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593401#comment-14593401 ] Aihua Xu commented on HIVE-10972: - [~alangates] Ping. +[~ashutoshc] as well, since it seems you also have knowledge on that front. > DummyTxnManager always locks the current database in shared mode, which is > incorrect. > - > > Key: HIVE-10972 > URL: https://issues.apache.org/jira/browse/HIVE-10972 > Project: Hive > Issue Type: Bug > Components: Locking >Affects Versions: 2.0.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Attachments: HIVE-10972.2.patch, HIVE-10972.patch > > > In DummyTxnManager [line 163 | > http://grepcode.com/file/repo1.maven.org/maven2/co.cask.cdap/hive-exec/0.13.0/org/apache/hadoop/hive/ql/lockmgr/DummyTxnManager.java#163], > it always locks the current database. > That is not correct, since the current database can be "db1" while the query > is "select * from db2.tb1", which will lock db1 unnecessarily. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10999) Upgrade Spark dependency to 1.4 [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-10999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HIVE-10999: -- Attachment: HIVE-10999.2-spark.patch Talked about this with Chengxiang. We think the reason is that Spark doesn't depend on {{jersey-servlet}}. So during tests, we only add {{jersey-server}} to the classpath, at version 1.14 (although I don't know why Maven doesn't pick up the 1.9 version, which is what Spark really depends on). Then we hit the class-not-found error. There won't be a problem at runtime because spark-assembly is on the classpath; that's why the qtests passed. To solve the issue, we can either explicitly add the {{jersey-servlet}} dependency to the failed tests, or we can upgrade Hive's Jersey version to 1.9 and remove the dependency on {{jersey-servlet}} (it doesn't exist in 1.9). Patch v2 takes the first approach, which is simpler, but I think the second approach may be better, if possible. > Upgrade Spark dependency to 1.4 [Spark Branch] > -- > > Key: HIVE-10999 > URL: https://issues.apache.org/jira/browse/HIVE-10999 > Project: Hive > Issue Type: Sub-task > Components: Spark >Reporter: Xuefu Zhang >Assignee: Rui Li > Attachments: HIVE-10999.1-spark.patch, HIVE-10999.2-spark.patch > > > Spark 1.4.0 is released. Let's update the dependency version from 1.3.1 to > 1.4.0. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
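For the first approach described above, the fix would amount to adding an explicit test-scoped dependency along these lines to the POMs of the failing test modules. This is a sketch only; the exact modules, version, and scope are assumptions, not taken from the actual patch:

```xml
<!-- Hypothetical pom.xml fragment: pin jersey-servlet for tests so the
     servlet classes are on the test classpath even without spark-assembly. -->
<dependency>
  <groupId>com.sun.jersey</groupId>
  <artifactId>jersey-servlet</artifactId>
  <version>1.14</version>
  <scope>test</scope>
</dependency>
```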
[jira] [Updated] (HIVE-10996) Aggregation / Projection over Multi-Join Inner Query producing incorrect results
[ https://issues.apache.org/jira/browse/HIVE-10996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-10996: --- Attachment: HIVE-10996.04.patch > Aggregation / Projection over Multi-Join Inner Query producing incorrect > results > > > Key: HIVE-10996 > URL: https://issues.apache.org/jira/browse/HIVE-10996 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 1.0.0, 1.2.0, 1.1.0, 1.3.0, 2.0.0 >Reporter: Gautam Kowshik >Assignee: Jesus Camacho Rodriguez >Priority: Critical > Attachments: HIVE-10996.01.patch, HIVE-10996.02.patch, > HIVE-10996.03.patch, HIVE-10996.04.patch, HIVE-10996.patch, explain_q1.txt, > explain_q2.txt > > > We see the following problem on 1.1.0 and 1.2.0 but not 0.13 which seems like > a regression. > The following query (Q1) produces no results: > {code} > select s > from ( > select last.*, action.st2, action.n > from ( > select purchase.s, purchase.timestamp, max (mevt.timestamp) as > last_stage_timestamp > from (select * from purchase_history) purchase > join (select * from cart_history) mevt > on purchase.s = mevt.s > where purchase.timestamp > mevt.timestamp > group by purchase.s, purchase.timestamp > ) last > join (select * from events) action > on last.s = action.s and last.last_stage_timestamp = action.timestamp > ) list; > {code} > While this one (Q2) does produce results : > {code} > select * > from ( > select last.*, action.st2, action.n > from ( > select purchase.s, purchase.timestamp, max (mevt.timestamp) as > last_stage_timestamp > from (select * from purchase_history) purchase > join (select * from cart_history) mevt > on purchase.s = mevt.s > where purchase.timestamp > mevt.timestamp > group by purchase.s, purchase.timestamp > ) last > join (select * from events) action > on last.s = action.s and last.last_stage_timestamp = action.timestamp > ) list; > 1 21 20 Bob 1234 > 1 31 30 Bob 1234 > 3 51 50 Jeff1234 > {code} > The setup to test this is: > {code} > 
create table purchase_history (s string, product string, price double, > timestamp int); > insert into purchase_history values ('1', 'Belt', 20.00, 21); > insert into purchase_history values ('1', 'Socks', 3.50, 31); > insert into purchase_history values ('3', 'Belt', 20.00, 51); > insert into purchase_history values ('4', 'Shirt', 15.50, 59); > create table cart_history (s string, cart_id int, timestamp int); > insert into cart_history values ('1', 1, 10); > insert into cart_history values ('1', 2, 20); > insert into cart_history values ('1', 3, 30); > insert into cart_history values ('1', 4, 40); > insert into cart_history values ('3', 5, 50); > insert into cart_history values ('4', 6, 60); > create table events (s string, st2 string, n int, timestamp int); > insert into events values ('1', 'Bob', 1234, 20); > insert into events values ('1', 'Bob', 1234, 30); > insert into events values ('1', 'Bob', 1234, 25); > insert into events values ('2', 'Sam', 1234, 30); > insert into events values ('3', 'Jeff', 1234, 50); > insert into events values ('4', 'Ted', 1234, 60); > {code} > I realize select * and select s are not all that interesting in this context > but what lead us to this issue was select count(distinct s) was not returning > results. The above queries are the simplified queries that produce the issue. > I will note that if I convert the inner join to a table and select from that > the issue does not appear. > Update: Found that turning off hive.optimize.remove.identity.project fixes > this issue. This optimization was introduced in > https://issues.apache.org/jira/browse/HIVE-8435 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-7193) Hive should support additional LDAP authentication parameters
[ https://issues.apache.org/jira/browse/HIVE-7193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593269#comment-14593269 ] Hive QA commented on HIVE-7193: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12740574/HIVE-7193.5.patch {color:green}SUCCESS:{color} +1 9010 tests passed Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4319/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4319/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4319/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. 
ATTACHMENT ID: 12740574 - PreCommit-HIVE-TRUNK-Build > Hive should support additional LDAP authentication parameters > - > > Key: HIVE-7193 > URL: https://issues.apache.org/jira/browse/HIVE-7193 > Project: Hive > Issue Type: Bug >Affects Versions: 0.10.0 >Reporter: Mala Chikka Kempanna >Assignee: Naveen Gangam > Attachments: HIVE-7193.2.patch, HIVE-7193.3.patch, HIVE-7193.4.patch, > HIVE-7193.5.patch, HIVE-7193.patch, LDAPAuthentication_Design_Doc.docx, > LDAPAuthentication_Design_Doc_V2.docx > > > Currently Hive has only the following parameters for LDAP > authentication for HiveServer2: > {code:xml} > <property> >   <name>hive.server2.authentication</name> >   <value>LDAP</value> > </property> > <property> >   <name>hive.server2.authentication.ldap.url</name> >   <value>ldap://our_ldap_address</value> > </property> > {code} > We need to include other LDAP properties as part of Hive LDAP authentication, > like below: > {noformat} > a group search base -> dc=domain,dc=com > a group search filter -> member={0} > a user search base -> dc=domain,dc=com > a user search filter -> sAMAccountName={0} > a list of valid user groups -> group1,group2,group3 > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
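For context, the additional parameters requested above would surface as further hive.server2.authentication.ldap.* properties in hive-site.xml. A hedged sketch of what such a configuration could look like follows; the property names here are illustrative assumptions, not a statement of what this patch finally shipped:

```xml
<!-- Illustrative only: property names are sketched from the request above.
     Consult the Hive release that ships HIVE-7193 for the final names. -->
<property>
  <name>hive.server2.authentication.ldap.groupDNPattern</name>
  <value>CN=%s,dc=domain,dc=com</value>
</property>
<property>
  <name>hive.server2.authentication.ldap.groupFilter</name>
  <value>group1,group2,group3</value>
</property>
<property>
  <name>hive.server2.authentication.ldap.userDNPattern</name>
  <value>CN=%s,dc=domain,dc=com</value>
</property>
```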
[jira] [Commented] (HIVE-10996) Aggregation / Projection over Multi-Join Inner Query producing incorrect results
[ https://issues.apache.org/jira/browse/HIVE-10996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593260#comment-14593260 ] Jesus Camacho Rodriguez commented on HIVE-10996: [~jpullokkaran], thanks for the comments. 1) You are right, the solution should not be tailored towards the FIL operator only, as this could happen in other cases; I will change the patch accordingly. 2) The error is produced because the SEL operator above the FIL operator is removed by the IdentityProjectRemoval optimization (since SEL and FIL have the same schema, this is exactly what IdentityProjectRemoval is supposed to do). However, that SEL operator was actually pruning columns out of the input tuples. In the specific example provided here, the SEL operator is followed by a JOIN operator, which then joined on the wrong columns and thus did not produce the right results. > Aggregation / Projection over Multi-Join Inner Query producing incorrect > results > > > Key: HIVE-10996 > URL: https://issues.apache.org/jira/browse/HIVE-10996 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 1.0.0, 1.2.0, 1.1.0, 1.3.0, 2.0.0 >Reporter: Gautam Kowshik >Assignee: Jesus Camacho Rodriguez >Priority: Critical > Attachments: HIVE-10996.01.patch, HIVE-10996.02.patch, > HIVE-10996.03.patch, HIVE-10996.patch, explain_q1.txt, explain_q2.txt > > > We see the following problem on 1.1.0 and 1.2.0 but not on 0.13, which seems like > a regression. 
> The following query (Q1) produces no results: > {code} > select s > from ( > select last.*, action.st2, action.n > from ( > select purchase.s, purchase.timestamp, max (mevt.timestamp) as > last_stage_timestamp > from (select * from purchase_history) purchase > join (select * from cart_history) mevt > on purchase.s = mevt.s > where purchase.timestamp > mevt.timestamp > group by purchase.s, purchase.timestamp > ) last > join (select * from events) action > on last.s = action.s and last.last_stage_timestamp = action.timestamp > ) list; > {code} > While this one (Q2) does produce results : > {code} > select * > from ( > select last.*, action.st2, action.n > from ( > select purchase.s, purchase.timestamp, max (mevt.timestamp) as > last_stage_timestamp > from (select * from purchase_history) purchase > join (select * from cart_history) mevt > on purchase.s = mevt.s > where purchase.timestamp > mevt.timestamp > group by purchase.s, purchase.timestamp > ) last > join (select * from events) action > on last.s = action.s and last.last_stage_timestamp = action.timestamp > ) list; > 1 21 20 Bob 1234 > 1 31 30 Bob 1234 > 3 51 50 Jeff1234 > {code} > The setup to test this is: > {code} > create table purchase_history (s string, product string, price double, > timestamp int); > insert into purchase_history values ('1', 'Belt', 20.00, 21); > insert into purchase_history values ('1', 'Socks', 3.50, 31); > insert into purchase_history values ('3', 'Belt', 20.00, 51); > insert into purchase_history values ('4', 'Shirt', 15.50, 59); > create table cart_history (s string, cart_id int, timestamp int); > insert into cart_history values ('1', 1, 10); > insert into cart_history values ('1', 2, 20); > insert into cart_history values ('1', 3, 30); > insert into cart_history values ('1', 4, 40); > insert into cart_history values ('3', 5, 50); > insert into cart_history values ('4', 6, 60); > create table events (s string, st2 string, n int, timestamp int); > insert into events values ('1', 
'Bob', 1234, 20); > insert into events values ('1', 'Bob', 1234, 30); > insert into events values ('1', 'Bob', 1234, 25); > insert into events values ('2', 'Sam', 1234, 30); > insert into events values ('3', 'Jeff', 1234, 50); > insert into events values ('4', 'Ted', 1234, 60); > {code} > I realize select * and select s are not all that interesting in this context, > but what led us to this issue was that select count(distinct s) was not returning > results. The above queries are the simplified queries that produce the issue. > I will note that if I convert the inner join to a table and select from that, > the issue does not appear. > Update: Found that turning off hive.optimize.remove.identity.project fixes > this issue. This optimization was introduced in > https://issues.apache.org/jira/browse/HIVE-8435 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
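The failure mode described in the comment above — an "identity" projection that is really pruning columns, whose removal shifts the positions that a downstream positional join relies on — can be illustrated outside Hive with a toy sketch. This is plain Python with hypothetical operator functions, not Hive's actual operator classes:

```python
# Toy illustration of why removing a projection that the optimizer believes
# is an identity can break a downstream join that addresses columns by
# position. Not Hive code; the operators here are deliberately simplistic.

def project(rows, cols):
    """SEL-like operator: keep only the listed column positions."""
    return [tuple(r[c] for c in cols) for r in rows]

def join_on(left, right, l_col, r_col):
    """JOIN-like operator: positional equi-join on one column each side."""
    return [l + r for l in left for r in right if l[l_col] == r[r_col]]

# Rows shaped like the 'last' subquery: (s, timestamp, last_stage_timestamp)
last = [('1', 21, 20), ('1', 31, 30), ('3', 51, 50)]
# Rows shaped like 'action': (s, timestamp)
action = [('1', 20), ('1', 30), ('3', 50)]

# Correct plan: SEL prunes to (s, last_stage_timestamp); the join key is
# then at position 1 of the projected output.
good = join_on(project(last, [0, 2]), action, 1, 1)

# Plan after wrongly removing the SEL (its arity matched, so it looked like
# an identity): position 1 of the raw rows is now the wrong column
# (timestamp, not last_stage_timestamp), so nothing matches.
bad = join_on(last, action, 1, 1)
```

With the projection in place the join produces three rows, mirroring Q2; with it removed the result is empty, mirroring Q1.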
[jira] [Commented] (HIVE-8128) Improve Parquet Vectorization
[ https://issues.apache.org/jira/browse/HIVE-8128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593228#comment-14593228 ] Dong Chen commented on HIVE-8128: - Sure, I did a quick update on Hive using the new changes and the test results seem fine! Thanks, [~nezihyigitbasi] Next week I will write the complete code and try more cases to see what can be found. > Improve Parquet Vectorization > - > > Key: HIVE-8128 > URL: https://issues.apache.org/jira/browse/HIVE-8128 > Project: Hive > Issue Type: Sub-task >Reporter: Brock Noland >Assignee: Dong Chen > Fix For: parquet-branch > > Attachments: HIVE-8128-parquet.patch.POC, HIVE-8128.1-parquet.patch > > > NO PRECOMMIT TESTS > What we'll want to do is finish the vectorization work (e.g. VectorizedOrcSerde), > which was partially done in HIVE-5998. > As discussed in PARQUET-131, we will work out a Hive POC based on the new > Parquet vectorized API, and then finish the implementation once it is finalized. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-4605) Hive job fails while closing reducer output - Unable to rename
[ https://issues.apache.org/jira/browse/HIVE-4605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593185#comment-14593185 ] Benoit Perroud commented on HIVE-4605: -- I'm seeing this error, too, with Hive 0.13, but it's because another process deleted the {{_tmp.-ext-10001}} folder. So it's not really a bug from my perspective. To find the process that deleted the folder, have a look at the HDFS audit logs. > Hive job fails while closing reducer output - Unable to rename > -- > > Key: HIVE-4605 > URL: https://issues.apache.org/jira/browse/HIVE-4605 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 0.11.0, 0.12.0, 0.13.0, 0.13.1 > Environment: OS: 2.6.18-194.el5xen #1 SMP Fri Apr 2 15:34:40 EDT 2010 > x86_64 x86_64 x86_64 GNU/Linux > Hadoop 1.1.2 >Reporter: Link Qian >Assignee: Brock Noland > Attachments: HIVE-4605.patch > > > 1, create a table with ORC storage model > create table iparea_analysis_orc (network int, ip string, ) > stored as ORC; > 2, insert table iparea_analysis_orc select network, ip, , the script > succeeds, but fails after adding the *OVERWRITE* keyword. The main error log is listed > here. 
> java.lang.RuntimeException: Hive Runtime Error while closing operators: Unable > to rename output from: > hdfs://qa3hop001.uucun.com:9000/tmp/hive-hadoop/hive_2013-05-24_15-11-06_511_7746839019590922068/_task_tmp.-ext-1/_tmp.00_0 > to: > hdfs://qa3hop001.uucun.com:9000/tmp/hive-hadoop/hive_2013-05-24_15-11-06_511_7746839019590922068/_tmp.-ext-1/00_0 > at > org.apache.hadoop.hive.ql.exec.ExecReducer.close(ExecReducer.java:317) > at > org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:530) > at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:421) > at org.apache.hadoop.mapred.Child$4.run(Child.java:255) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:396) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149) > at org.apache.hadoop.mapred.Child.main(Child.java:249) > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to rename > output from: > hdfs://qa3hop001.uucun.com:9000/tmp/hive-hadoop/hive_2013-05-24_15-11-06_511_7746839019590922068/_task_tmp.-ext-1/_tmp.00_0 > to: > hdfs://qa3hop001.uucun.com:9000/tmp/hive-hadoop/hive_2013-05-24_15-11-06_511_7746839019590922068/_tmp.-ext-1/00_0 > at > org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.commit(FileSinkOperator.java:197) > at > org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.access$300(FileSinkOperator.java:108) > at > org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:867) > at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:588) > at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:597) > at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:597) > at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:597) > at > org.apache.hadoop.hive.ql.exec.ExecReducer.close(ExecReducer.java:309) > ... 7 more -- This message was sent by Atlassian JIRA (v6.3.4#6332)
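The race Benoit describes — a commit-time rename failing because another process already deleted the task's temporary output — can be reproduced in miniature outside HDFS. This is a plain-Python sketch with illustrative path names modeled on the stack trace above, not Hive's FileSinkOperator code:

```python
# Toy reproduction of the "Unable to rename output" failure mode: a
# commit-style rename fails if another process removes the tmp file first.
import os
import tempfile

base = tempfile.mkdtemp()
tmp_out = os.path.join(base, '_task_tmp.-ext-1', '_tmp.00_0')
final_out = os.path.join(base, '_tmp.-ext-1', '00_0')
os.makedirs(os.path.dirname(tmp_out))
os.makedirs(os.path.dirname(final_out))
with open(tmp_out, 'w') as f:
    f.write('rows')

# Another process (e.g. a cleanup job) deletes the tmp output first:
os.remove(tmp_out)

try:
    os.rename(tmp_out, final_out)  # the FileSinkOperator-style commit step
    committed = True
except FileNotFoundError:
    committed = False
```

The commit fails, which is why the HDFS audit log (recording who issued the delete) is the right place to look rather than Hive itself.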
[jira] [Commented] (HIVE-11037) HiveOnTez: make explain user level = true as default
[ https://issues.apache.org/jira/browse/HIVE-11037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593170#comment-14593170 ] Hive QA commented on HIVE-11037: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12740570/HIVE-11037.02.patch {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 9011 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_1 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_2 org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4318/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4318/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4318/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12740570 - PreCommit-HIVE-TRUNK-Build > HiveOnTez: make explain user level = true as default > > > Key: HIVE-11037 > URL: https://issues.apache.org/jira/browse/HIVE-11037 > Project: Hive > Issue Type: Improvement >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-11037.01.patch, HIVE-11037.02.patch > > > In Hive-9780, we introduced a new level of explain for hive on tez. We would > like to make it running by default. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11040) Change Derby dependency version to 10.10.2.0
[ https://issues.apache.org/jira/browse/HIVE-11040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-11040: Component/s: Metastore > Change Derby dependency version to 10.10.2.0 > > > Key: HIVE-11040 > URL: https://issues.apache.org/jira/browse/HIVE-11040 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Jason Dere >Assignee: Jason Dere > Fix For: 1.2.1, 2.0.0 > > Attachments: HIVE-11040.1.patch > > > We don't see this on the Apache pre-commit tests because it uses PTest, but > running the entire TestCliDriver suite results in failures in some of the > partition-related qtests (partition_coltype_literals, partition_date, > partition_date2). I've only really seen this on Linux (I was using CentOS). > HIVE-8879 changed the Derby dependency version from 10.10.1.1 to 10.11.1.1. > Testing with 10.10.1.1 or 10.10.2.0 seems to allow the partition-related > tests to pass. I'd like to change the dependency version to 10.10.2.0, since > that version should also contain the fix for HIVE-8879. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11040) Change Derby dependency version to 10.10.2.0
[ https://issues.apache.org/jira/browse/HIVE-11040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-11040: Fix Version/s: 2.0.0 > Change Derby dependency version to 10.10.2.0 > > > Key: HIVE-11040 > URL: https://issues.apache.org/jira/browse/HIVE-11040 > Project: Hive > Issue Type: Bug >Reporter: Jason Dere >Assignee: Jason Dere > Fix For: 1.2.1, 2.0.0 > > Attachments: HIVE-11040.1.patch > > > We don't see this on the Apache pre-commit tests because it uses PTest, but > running the entire TestCliDriver suite results in failures in some of the > partition-related qtests (partition_coltype_literals, partition_date, > partition_date2). I've only really seen this on Linux (I was using CentOS). > HIVE-8879 changed the Derby dependency version from 10.10.1.1 to 10.11.1.1. > Testing with 10.10.1.1 or 10.10.2.0 seems to allow the partition-related > tests to pass. I'd like to change the dependency version to 10.10.2.0, since > that version should also contain the fix for HIVE-8879. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
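The dependency change itself is a one-line version bump. A sketch of what it might look like in Maven terms follows; the derby.version property name is an assumption about how Hive's root pom pins the version, not a quote from the patch:

```xml
<!-- Sketch only: assumes the root pom pins Derby via a version property. -->
<properties>
  <!-- was 10.11.1.1 after HIVE-8879 -->
  <derby.version>10.10.2.0</derby.version>
</properties>
<dependency>
  <groupId>org.apache.derby</groupId>
  <artifactId>derby</artifactId>
  <version>${derby.version}</version>
</dependency>
```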
[jira] [Commented] (HIVE-11042) Need fix Utilities.replaceTaskId method
[ https://issues.apache.org/jira/browse/HIVE-11042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593141#comment-14593141 ] Chao Sun commented on HIVE-11042: - Also, the comments for this method need a little improvement. For instance, "for example" -> "For example", "This method, pattern is in taskId." -> "In this method, pattern is in taskId", etc. > Need fix Utilities.replaceTaskId method > --- > > Key: HIVE-11042 > URL: https://issues.apache.org/jira/browse/HIVE-11042 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 1.2.0 >Reporter: Yongzhi Chen >Assignee: Yongzhi Chen > Attachments: HIVE-11042.1.patch > > > While looking at another bug, I found that the Utilities.replaceTaskId(String, > int) method is not right. > For example, > Utilities.replaceTaskId("(ds%3D1)01", 5); > returns 5. > It should return (ds%3D1)05 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11042) Need fix Utilities.replaceTaskId method
[ https://issues.apache.org/jira/browse/HIVE-11042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593140#comment-14593140 ] Chao Sun commented on HIVE-11042: - This code overlaps a lot with the existing replaceTaskId(String, String). Is it possible to just modify that method? > Need fix Utilities.replaceTaskId method > --- > > Key: HIVE-11042 > URL: https://issues.apache.org/jira/browse/HIVE-11042 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 1.2.0 >Reporter: Yongzhi Chen >Assignee: Yongzhi Chen > Attachments: HIVE-11042.1.patch > > > While looking at another bug, I found that the Utilities.replaceTaskId(String, > int) method is not right. > For example, > Utilities.replaceTaskId("(ds%3D1)01", 5); > returns 5. > It should return (ds%3D1)05 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
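The intended semantics described in the report above — replace the trailing task-id digits while preserving any prefix such as "(ds%3D1)" and the zero-padding width — can be sketched as follows. This is a hypothetical Python helper for illustration, not the actual Java method in Utilities:

```python
# Sketch of the intended behavior of Utilities.replaceTaskId(String, int):
# swap the trailing run of digits for the new task id, zero-padded to the
# same width, keeping everything before it intact.
import re

def replace_task_id(s, task_id):
    m = re.search(r'(\d+)$', s)     # trailing digits only
    if not m:
        return s                    # nothing to replace
    width = len(m.group(1))
    return s[:m.start(1)] + str(task_id).zfill(width)
```

For the example in the report, replace_task_id("(ds%3D1)01", 5) yields "(ds%3D1)05" rather than dropping the prefix.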
[jira] [Updated] (HIVE-10746) Hive 1.2.0+Tez produces 1-byte FileSplits from mapred.TextInputFormat
[ https://issues.apache.org/jira/browse/HIVE-10746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-10746: Fix Version/s: 2.0.0 > Hive 1.2.0+Tez produces 1-byte FileSplits from mapred.TextInputFormat > -- > > Key: HIVE-10746 > URL: https://issues.apache.org/jira/browse/HIVE-10746 > Project: Hive > Issue Type: Bug > Components: Hive, Tez >Affects Versions: 0.14.0, 0.14.1, 1.2.0, 1.1.0, 1.1.1 >Reporter: Greg Senia >Assignee: Gopal V >Priority: Critical > Fix For: 1.2.1, 2.0.0 > > Attachments: HIVE-10746.1.patch, HIVE-10746.2.patch, > slow_query_output.zip > > > The following query: > {code:sql} > SELECT appl_user_id, arsn_cd, COUNT(*) as RecordCount FROM adw.crc_arsn GROUP > BY appl_user_id,arsn_cd ORDER BY appl_user_id; > {code} > runs consistently fast in Spark and MapReduce on Hive 1.2.0. When attempting > to run this same query against Tez as the execution engine, it consistently > runs for 300-500 seconds, which seems extremely long. This is a basic > external table delimited by tabs and is a single file in a folder. In Hive > 0.13 this query with Tez runs fast, and I tested with Hive 0.14, 0.14.1/1.0.0 > and now Hive 1.2.0, and there clearly is something going awry with Hive w/Tez > as an execution engine with single or small-file tables. I can attach further > logs if someone needs them for deeper analysis. 
> HDFS Output: > {noformat} > hadoop fs -ls /example_dw/crc/arsn > Found 2 items > -rwxr-x--- 6 loaduser hadoopusers 0 2015-05-17 20:03 > /example_dw/crc/arsn/_SUCCESS > -rwxr-x--- 6 loaduser hadoopusers3883880 2015-05-17 20:03 > /example_dw/crc/arsn/part-m-0 > {noformat} > Hive Table Describe: > {noformat} > hive> describe formatted crc_arsn; > OK > # col_name data_type comment > > arsn_cd string > clmlvl_cd string > arclss_cd string > arclssg_cd string > arsn_prcsr_rmk_ind string > arsn_mbr_rspns_ind string > savtyp_cd string > arsn_eff_dt string > arsn_exp_dt string > arsn_pstd_dts string > arsn_lstupd_dts string > arsn_updrsn_txt string > appl_user_idstring > arsntyp_cd string > pre_d_indicator string > arsn_display_txtstring > arstat_cd string > arsn_tracking_nostring > arsn_cstspcfc_ind string > arsn_mstr_rcrd_ind string > state_specific_ind string > region_specific_in string > arsn_dpndnt_cd string > unit_adjustment_in string > arsn_mbr_only_ind string > arsn_qrmb_ind string > > # Detailed Table Information > Database: adw > Owner: loadu...@exa.example.com > CreateTime: Mon Apr 28 13:28:05 EDT 2014 > LastAccessTime: UNKNOWN > Protect Mode: None > Retention: 0 > Location: hdfs://xhadnnm1p.example.com:8020/example_dw/crc/arsn > > Table Type: EXTERNAL_TABLE > Table Parameters: > EXTERNALTRUE > transient_lastDdlTime 1398706085 > > # Storage Information > SerDe Library: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe > > InputFormat:org.apache.hadoop.mapred.TextInputFormat > OutputFormat: > org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat > Compressed: No > Num Buckets: