[jira] [Updated] (HIVE-18352) introduce a METADATAONLY option while doing REPL DUMP to allow integrations of other tools
[ https://issues.apache.org/jira/browse/HIVE-18352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-18352: -- Labels: pull-request-available (was: )

> introduce a METADATAONLY option while doing REPL DUMP to allow integrations of other tools
> ---
> Key: HIVE-18352
> URL: https://issues.apache.org/jira/browse/HIVE-18352
> Project: Hive
> Issue Type: Bug
> Affects Versions: 3.0.0
> Reporter: anishek
> Assignee: anishek
> Labels: pull-request-available
> Fix For: 3.0.0
> Attachments: HIVE-18352.0.patch
>
> * Introduce a METADATAONLY option as part of the REPL DUMP command which will only dump out events for DDL changes. This will be faster, as we won't need to scan files on HDFS for DML changes.
> * Additionally, since we are only going to dump metadata operations, it might be useful to include ACID tables via an option. This option can be removed when ACID support is complete via HIVE-18320.
> It will be good to support the "WITH" clause as part of the REPL DUMP command as well (repl dump already supports it via HIVE-17757) to achieve the above, as that will minimize changes to the syntax of the statement and provide more flexibility in future to include additional options.
> {code}
> REPL DUMP [db_name] {FROM [event_id]} {TO [event_id]} {WITH (['key'='value'],...)}
> {code}
> This will enable other tools like security / schema registry / metadata discovery to use the replication subsystem for their needs as well.

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
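The proposed WITH clause carries ('key'='value') pairs into the command. As a rough illustration of what turning such a clause into a per-command config overlay involves, here is a stand-alone sketch; the class and method names are invented for illustration (Hive's real parser handles this in the grammar), and this naive comma split would not cope with quoted commas:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: collect the key/value pairs of a
// REPL DUMP ... WITH (['key'='value'],...) clause into a map that can
// overlay the session configuration for just this command.
public class ReplDumpOptions {
    public static Map<String, String> parse(String withClause) {
        Map<String, String> opts = new LinkedHashMap<>();
        String body = withClause.trim();
        // strip the surrounding parentheses
        if (body.startsWith("(") && body.endsWith(")")) {
            body = body.substring(1, body.length() - 1);
        }
        // split into 'key'='value' pairs (naive: breaks on quoted commas)
        for (String pair : body.split(",")) {
            String[] kv = pair.split("=", 2);
            if (kv.length == 2) {
                opts.put(stripQuotes(kv[0]), stripQuotes(kv[1]));
            }
        }
        return opts;
    }

    private static String stripQuotes(String s) {
        s = s.trim();
        if (s.length() >= 2 && s.startsWith("'") && s.endsWith("'")) {
            s = s.substring(1, s.length() - 1);
        }
        return s;
    }
}
```

With such an overlay, a caller could pass e.g. `('hive.repl.dump.metadata.only'='true')` without any new top-level syntax, which is the flexibility argument made above.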
[jira] [Commented] (HIVE-18352) introduce a METADATAONLY option while doing REPL DUMP to allow integrations of other tools
[ https://issues.apache.org/jira/browse/HIVE-18352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312651#comment-16312651 ] ASF GitHub Bot commented on HIVE-18352: --- GitHub user anishek opened a pull request:

https://github.com/apache/hive/pull/286

HIVE-18352: introduce a METADATAONLY option while doing REPL DUMP to allow integrations of other tools

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/anishek/hive HIVE-18352

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/hive/pull/286.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #286

commit c03814bd857cfd70a40aa7a5ec674e73cbfc63f9
Author: Anishek Agarwal
Date: 2018-01-03T10:27:04Z

    HIVE-18352: introduce a METADATAONLY option while doing REPL DUMP to allow integrations of other tools

> introduce a METADATAONLY option while doing REPL DUMP to allow integrations of other tools
> ---
> Key: HIVE-18352
> URL: https://issues.apache.org/jira/browse/HIVE-18352
> Project: Hive
> Issue Type: Bug
> Affects Versions: 3.0.0
> Reporter: anishek
> Assignee: anishek
> Labels: pull-request-available
> Fix For: 3.0.0
> Attachments: HIVE-18352.0.patch
>
> * Introduce a METADATAONLY option as part of the REPL DUMP command which will only dump out events for DDL changes. This will be faster, as we won't need to scan files on HDFS for DML changes.
> * Additionally, since we are only going to dump metadata operations, it might be useful to include ACID tables via an option. This option can be removed when ACID support is complete via HIVE-18320.
> It will be good to support the "WITH" clause as part of the REPL DUMP command as well (repl dump already supports it via HIVE-17757) to achieve the above, as that will minimize changes to the syntax of the statement and provide more flexibility in future to include additional options.
> {code}
> REPL DUMP [db_name] {FROM [event_id]} {TO [event_id]} {WITH (['key'='value'],...)}
> {code}
> This will enable other tools like security / schema registry / metadata discovery to use the replication subsystem for their needs as well.

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-18221) test acid default
[ https://issues.apache.org/jira/browse/HIVE-18221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312644#comment-16312644 ] Hive QA commented on HIVE-18221:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12904685/HIVE-18221.23.patch

{color:green}SUCCESS:{color} +1 due to 11 test(s) being added or modified.
{color:red}ERROR:{color} -1 due to 342 failed/errored test(s), 10959 tests executed

*Failed tests:*
{noformat}
TestAvroHCatLoader - did not produce a TEST-*.xml file (likely timed out) (batchId=190)
TestAvroHCatStorer - did not produce a TEST-*.xml file (likely timed out) (batchId=190)
TestBeeLineWithArgs - did not produce a TEST-*.xml file (likely timed out) (batchId=228)
TestBeelineConnectionUsingHiveSite - did not produce a TEST-*.xml file (likely timed out) (batchId=228)
TestBeelinePasswordOption - did not produce a TEST-*.xml file (likely timed out) (batchId=228)
TestBeelineWithUserHs2ConnectionFile - did not produce a TEST-*.xml file (likely timed out) (batchId=228)
TestCopyUtils - did not produce a TEST-*.xml file (likely timed out) (batchId=225)
TestCustomAuthentication - did not produce a TEST-*.xml file (likely timed out) (batchId=228)
TestDbNotificationListener - did not produce a TEST-*.xml file (likely timed out) (batchId=239)
TestDefaultHCatRecord - did not produce a TEST-*.xml file (likely timed out) (batchId=198)
TestE2EScenarios - did not produce a TEST-*.xml file (likely timed out) (batchId=190)
TestHCatDynamicPartitioned - did not produce a TEST-*.xml file (likely timed out) (batchId=194)
TestHCatExternalDynamicPartitioned - did not produce a TEST-*.xml file (likely timed out) (batchId=196)
TestHCatExternalNonPartitioned - did not produce a TEST-*.xml file (likely timed out) (batchId=197)
TestHCatExternalPartitioned - did not produce a TEST-*.xml file (likely timed out) (batchId=193)
TestHCatHiveCompatibility - did not produce a TEST-*.xml file (likely timed out) (batchId=239)
TestHCatHiveThriftCompatibility - did not produce a TEST-*.xml file (likely timed out) (batchId=239)
TestHCatInputFormat - did not produce a TEST-*.xml file (likely timed out) (batchId=197)
TestHCatInputFormatMethods - did not produce a TEST-*.xml file (likely timed out) (batchId=197)
TestHCatLoaderComplexSchema - did not produce a TEST-*.xml file (likely timed out) (batchId=190)
TestHCatLoaderEncryption - did not produce a TEST-*.xml file (likely timed out) (batchId=190)
TestHCatLoaderStorer - did not produce a TEST-*.xml file (likely timed out) (batchId=190)
TestHCatMultiOutputFormat - did not produce a TEST-*.xml file (likely timed out) (batchId=197)
TestHCatMutableDynamicPartitioned - did not produce a TEST-*.xml file (likely timed out) (batchId=191)
TestHCatMutableNonPartitioned - did not produce a TEST-*.xml file (likely timed out) (batchId=197)
TestHCatMutablePartitioned - did not produce a TEST-*.xml file (likely timed out) (batchId=195)
TestHCatNonPartitioned - did not produce a TEST-*.xml file (likely timed out) (batchId=192)
TestHCatOutputFormat - did not produce a TEST-*.xml file (likely timed out) (batchId=197)
TestHCatPartitionPublish - did not produce a TEST-*.xml file (likely timed out) (batchId=192)
TestHCatPartitioned - did not produce a TEST-*.xml file (likely timed out) (batchId=192)
TestHCatSchema - did not produce a TEST-*.xml file (likely timed out) (batchId=198)
TestHCatSchemaUtils - did not produce a TEST-*.xml file (likely timed out) (batchId=198)
TestHCatStorerMulti - did not produce a TEST-*.xml file (likely timed out) (batchId=190)
TestHCatStorerWrapper - did not produce a TEST-*.xml file (likely timed out) (batchId=190)
TestHiveClientCache - did not produce a TEST-*.xml file (likely timed out) (batchId=197)
TestInputJobInfo - did not produce a TEST-*.xml file (likely timed out) (batchId=197)
TestJsonSerDe - did not produce a TEST-*.xml file (likely timed out) (batchId=198)
TestLazyHCatRecord - did not produce a TEST-*.xml file (likely timed out) (batchId=198)
TestMultiOutputFormat - did not produce a TEST-*.xml file (likely timed out) (batchId=197)
TestNotificationListener - did not produce a TEST-*.xml file (likely timed out) (batchId=200)
TestOrcHCatLoader - did not produce a TEST-*.xml file (likely timed out) (batchId=190)
TestOrcHCatStorer - did not produce a TEST-*.xml file (likely timed out) (batchId=190)
TestParquetHCatLoader - did not produce a TEST-*.xml file (likely timed out) (batchId=190)
TestParquetHCatStorer - did not produce a TEST-*.xml file (likely timed out) (batchId=190)
TestPassProperties - did not produce a TEST-*.xml file (likely timed out) (batchId=192)
TestPigHCatUtil - did not produce a TEST-*.xml file (likely timed out) (batchId=190)
TestRCFileHCatLoader - did not produce a TEST-*.xml file (likely timed out) (batchId=190)
TestRCFileHCatStorer - did not produce a TEST-
[jira] [Updated] (HIVE-18381) Drop table operation doesn't consider the hdfs acl privilege of the table location parent path
[ https://issues.apache.org/jira/browse/HIVE-18381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] youchuikai updated HIVE-18381: -- Status: Patch Available (was: In Progress)

> Drop table operation doesn't consider the hdfs acl privilege of the table location parent path
> ---
> Key: HIVE-18381
> URL: https://issues.apache.org/jira/browse/HIVE-18381
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Affects Versions: 1.1.0
> Environment: hive-1.1.0-cdh5.8.4
> Reporter: youchuikai
> Assignee: youchuikai
>
> {code:sql}
> // the push user belongs to the test_rw group
> hive> dfs -getfacl /user/hive/warehouse1/test1.db;
> # file: /user/hive/warehouse1/test1.db
> # owner: root
> # group: hive
> user::rwx
> group::rwx
> group:test_r:r-x
> group:test_rw:rwx
> mask::rwx
> other::---
> default:user::rwx
> default:group::rwx
> default:group:test_r:r-x
> default:group:test_rw:rwx
> default:mask::rwx
> default:other::---
> hive> drop table test1.youck_66;
> FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Table metadata not deleted since hdfs://nameservice-test1/user/hive/warehouse1/test1.db is not writable by push)
> {code}

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Work started] (HIVE-18381) Drop table operation doesn't consider the hdfs acl privilege of the table location parent path
[ https://issues.apache.org/jira/browse/HIVE-18381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-18381 started by youchuikai. -

> Drop table operation doesn't consider the hdfs acl privilege of the table location parent path
> ---
> Key: HIVE-18381
> URL: https://issues.apache.org/jira/browse/HIVE-18381
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Affects Versions: 1.1.0
> Environment: hive-1.1.0-cdh5.8.4
> Reporter: youchuikai
> Assignee: youchuikai
>
> {code:sql}
> // the push user belongs to the test_rw group
> hive> dfs -getfacl /user/hive/warehouse1/test1.db;
> # file: /user/hive/warehouse1/test1.db
> # owner: root
> # group: hive
> user::rwx
> group::rwx
> group:test_r:r-x
> group:test_rw:rwx
> mask::rwx
> other::---
> default:user::rwx
> default:group::rwx
> default:group:test_r:r-x
> default:group:test_rw:rwx
> default:mask::rwx
> default:other::---
> hive> drop table test1.youck_66;
> FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Table metadata not deleted since hdfs://nameservice-test1/user/hive/warehouse1/test1.db is not writable by push)
> {code}

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-18381) Drop table operation doesn't consider the hdfs acl privilege of the table location parent path
[ https://issues.apache.org/jira/browse/HIVE-18381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312602#comment-16312602 ] youchuikai commented on HIVE-18381: --- *fix this bug.*

{code:java}
Index: src/main/java/org/apache/hadoop/hive/metastore/Warehouse.java
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===
--- src/main/java/org/apache/hadoop/hive/metastore/Warehouse.java (date 1515137061000)
+++ src/main/java/org/apache/hadoop/hive/metastore/Warehouse.java (date 1515137079737)
@@ -43,6 +43,8 @@
 import org.apache.hadoop.fs.FileStatus;
 import org.apache.hadoop.fs.FileSystem;
 import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.permission.AclEntry;
+import org.apache.hadoop.fs.permission.AclStatus;
 import org.apache.hadoop.fs.permission.FsAction;
 import org.apache.hadoop.hive.common.FileUtils;
 import org.apache.hadoop.hive.common.HiveStatsUtils;
@@ -250,8 +252,10 @@
       return false;
     }
     final FileStatus stat;
+    final AclStatus aclStas;
     try {
       stat = getFs(path).getFileStatus(path);
+      aclStas = getFs(path).getAclStatus(path);
     } catch (FileNotFoundException fnfe){
       // File named by path doesn't exist; nothing to validate.
       return true;
@@ -266,23 +270,38 @@
     } catch (LoginException le) {
       throw new IOException(le);
     }
-    String user = ugi.getShortUserName();
+    String user = ugi.getShortUserName();
+    String[] groups = ugi.getGroupNames(); // group membership of the user, as seen by the metastore
     //check whether owner can delete
     if (stat.getOwner().equals(user) &&
         stat.getPermission().getUserAction().implies(FsAction.WRITE)) {
       return true;
     }
+
     //check whether group of the user can delete
     if (stat.getPermission().getGroupAction().implies(FsAction.WRITE)) {
-      String[] groups = ugi.getGroupNames();
       if (ArrayUtils.contains(groups, stat.getGroup())) {
         return true;
       }
     }
+
     //check whether others can delete (uncommon case!!)
     if (stat.getPermission().getOtherAction().implies(FsAction.WRITE)) {
       return true;
     }
+
+    // additionally honor named user/group ACL entries in the ACCESS scope
+    List<AclEntry> list = aclStas.getEntries();
+    for (AclEntry aclEntry : list) {
+      if (!"DEFAULT".equals(aclEntry.getScope().toString())
+          && aclEntry.getPermission().implies(FsAction.WRITE)
+          && aclEntry.getName() != null) {
+        if ("USER".equals(aclEntry.getType().toString()) && aclEntry.getName().equals(user)) {
+          LOG.info("acl user is " + aclEntry.getName() + "; hive cli user is " + user);
+          return true;
+        } else if ("GROUP".equals(aclEntry.getType().toString())
+            && ArrayUtils.contains(groups, aclEntry.getName())) {
+          return true;
+        }
+      }
+    }
     return false;
   }
/*
{code}

> Drop table operation doesn't consider the hdfs acl privilege of the table location parent path
> ---
> Key: HIVE-18381
> URL: https://issues.apache.org/jira/browse/HIVE-18381
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Affects Versions: 1.1.0
> Environment: hive-1.1.0-cdh5.8.4
> Reporter: youchuikai
> Assignee: youchuikai
>
> {code:sql}
> // the push user belongs to the test_rw group
> hive> dfs -getfacl /user/hive/warehouse1/test1.db;
> # file: /user/hive/warehouse1/test1.db
> # owner: root
> # group: hive
> user::rwx
> group::rwx
> group:test_r:r-x
> group:test_rw:rwx
> mask::rwx
> other::---
> default:user::rwx
> default:group::rwx
> default:group:test_r:r-x
> default:group:test_rw:rwx
> default:mask::rwx
> default:other::---
> hive> drop table test1.youck_66;
> FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Table metadata not deleted since hdfs://nameservice-test1/user/hive/warehouse1/test1.db is not writable by push)
> {code}

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
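The rule the patch adds can be read in isolation as: a named, ACCESS-scope user or group ACL entry that implies WRITE should grant the delete. Below is a minimal self-contained sketch of that rule, using a simplified stand-in for Hadoop's `org.apache.hadoop.fs.permission.AclEntry` (the real class is not used here) and `equals()`-based name comparison rather than the `==`/`!=` string comparisons from the original draft:

```java
import java.util.Arrays;
import java.util.List;

// Stand-alone sketch of the extended ACL write check. AclEntry here is a
// simplified stand-in for Hadoop's AclEntry, kept minimal so the logic of
// the check is visible on its own.
public class AclWriteCheck {
    enum Scope { ACCESS, DEFAULT }
    enum Type { USER, GROUP, MASK, OTHER }

    static final class AclEntry {
        final Scope scope; final Type type; final String name; final boolean write;
        AclEntry(Scope scope, Type type, String name, boolean write) {
            this.scope = scope; this.type = type; this.name = name; this.write = write;
        }
    }

    /** True if a named ACCESS-scope user or group entry grants WRITE. */
    static boolean aclAllowsWrite(List<AclEntry> entries, String user, String[] groups) {
        for (AclEntry e : entries) {
            // DEFAULT-scope entries apply to new children, not this path;
            // unnamed entries are covered by the classic owner/group/other checks
            if (e.scope == Scope.DEFAULT || !e.write || e.name == null) {
                continue;
            }
            if (e.type == Type.USER && e.name.equals(user)) {
                return true;
            }
            if (e.type == Type.GROUP && Arrays.asList(groups).contains(e.name)) {
                return true;
            }
        }
        return false;
    }
}
```

With the getfacl output from the description, the `group:test_rw:rwx` ACCESS entry would let a `push` user in `test_rw` pass this check, which is exactly the case the drop-table failure above misses.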
[jira] [Commented] (HIVE-17573) LLAP: JDK9 support fixes
[ https://issues.apache.org/jira/browse/HIVE-17573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312596#comment-16312596 ] liyunzhang commented on HIVE-17573: ---

[~gopalv]: thanks for your reply and tool.

bq. JDK9 seems to wake up the producer-consumer pair on the same NUMA zone (the IO elevator allocates, passes the array to the executor thread and executor passes it back instead of throwing it to GC deref).

If I don't add {{-XX:+UseNUMA}}, I guess the NUMA-handling optimization will not benefit the query, is that right? UseNUMA is disabled by default.

bq. the IO elevator allocates, passes the array to the executor thread and executor passes it back instead of throwing it to GC deref

I guess this will reduce GC work. From my test results, GC time is lower on JDK9 than on JDK8 for Hive on Spark in long queries ([link|https://docs.google.com/presentation/d/1cK9ZfUliAggH3NJzSvexTPwkXpbsM7Dm0o0kdmuFQUU/edit#slide=id.p]). Maybe this is because G1GC is the default garbage collector in JDK9 and the goal of [G1GC|https://docs.oracle.com/javase/9/gctuning/garbage-first-garbage-collector.htm#JSGCT-GUID-0394E76A-1A8F-425E-A0D0-B48A3DC82B42] is shorter GC time.

> LLAP: JDK9 support fixes
> ---
> Key: HIVE-17573
> URL: https://issues.apache.org/jira/browse/HIVE-17573
> Project: Hive
> Issue Type: Bug
> Components: llap
> Affects Versions: 3.0.0
> Reporter: Gopal V
> Assignee: Gopal V
>
> The perf diff between JDK8 -> JDK9 seems to be significant.
> TPC-H Q6 on JDK8 takes 32s on a single node + 1 Tb scale warehouse.
> TPC-H Q6 on JDK9 takes 19s on the same host + same data.
> The performance difference seems to come from better JIT and better NUMA handling.

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-4312) Make ORC SerDe support replace columns
[ https://issues.apache.org/jira/browse/HIVE-4312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312592#comment-16312592 ] Upendra Yadav commented on HIVE-4312: - Is there any plan to give this support?

> Make ORC SerDe support replace columns
> --
> Key: HIVE-4312
> URL: https://issues.apache.org/jira/browse/HIVE-4312
> Project: Hive
> Issue Type: Improvement
> Components: File Formats
> Affects Versions: 0.11.0
> Reporter: Kevin Wilfong
>
> In the alterTable method of DDLTask.java there is an explicit list of SerDes which support the replace columns command. ORC should support this, at least for partitioned tables, maybe not unpartitioned tables.
> This may be as simple as adding it to that list, but I suspect some significant changes will be needed to make this work with the CombineHiveInputFormat (e.g. where splits are combined and one split has a column stored as a string and in the other it is stored as an int).

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-18353) CompactorMR should call jobclient.close() to trigger cleanup
[ https://issues.apache.org/jira/browse/HIVE-18353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312583#comment-16312583 ] Prabhu Joseph commented on HIVE-18353: -- [~thejas] [~ekoifman] Can you review this when you get time? The failing test cases look unrelated.

> CompactorMR should call jobclient.close() to trigger cleanup
> ---
> Key: HIVE-18353
> URL: https://issues.apache.org/jira/browse/HIVE-18353
> Project: Hive
> Issue Type: Bug
> Components: Hive, Transactions
> Affects Versions: 1.2.1
> Reporter: Prabhu Joseph
> Assignee: Prabhu Joseph
> Attachments: HIVE-18353.1.patch, HIVE-18353.2.patch, HIVE-18353.patch
>
> The HiveMetastore process is leaking TrustStore reloader threads when running compaction, because JobClient close is not called from CompactorMR - MAPREDUCE-6618 and MAPREDUCE-6621
> {code}
> "Truststore reloader thread" #2814 daemon prio=1 os_prio=0 tid=0x00cdc800 nid=0x2f05a waiting on condition [0x7fdaef403000]
>    java.lang.Thread.State: TIMED_WAITING (sleeping)
>         at java.lang.Thread.sleep(Native Method)
>         at org.apache.hadoop.security.ssl.ReloadingX509TrustManager.run(ReloadingX509TrustManager.java:194)
>         at java.lang.Thread.run(Thread.java:745)
> {code}

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
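The fix amounts to closing the JobClient deterministically so its cleanup (which, per MAPREDUCE-6618/6621, is what stops the truststore reloader thread) actually runs. An illustrative try-with-resources sketch, with a fake client standing in for `org.apache.hadoop.mapred.JobClient` so the shape of the fix is visible without Hadoop on the classpath:

```java
// Illustrative sketch, not the actual CompactorMR code: the point is that
// the client is closed on every path, including exceptions, so any
// background resources it started are released.
public class JobClientCloseDemo {
    static int openClients = 0;  // tracks leaked clients for demonstration

    // stand-in for org.apache.hadoop.mapred.JobClient (AutoCloseable)
    static final class FakeJobClient implements AutoCloseable {
        FakeJobClient() { openClients++; }
        void submitJob() { /* submit the compaction MR job here */ }
        @Override public void close() { openClients--; }
    }

    static void runCompaction() {
        try (FakeJobClient jc = new FakeJobClient()) {
            jc.submitJob();
        } // jc.close() runs here even if submitJob() throws
    }
}
```

Before the fix, each compaction run would leave one client (and its reloader thread) behind; with try-with-resources the open-client count returns to zero after every run.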
[jira] [Commented] (HIVE-18221) test acid default
[ https://issues.apache.org/jira/browse/HIVE-18221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312581#comment-16312581 ] Hive QA commented on HIVE-18221:

| (x) *{color:red}-1 overall{color}* |
\\ \\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 1s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 28s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m 31s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 15s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 3m 1s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 6m 44s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 19s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 25s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 35s{color} | {color:red} ql: The patch generated 3 new + 356 unchanged - 0 fixed = 359 total (was 356) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 36s{color} | {color:red} root: The patch generated 3 new + 356 unchanged - 0 fixed = 359 total (was 356) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 6m 55s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 12s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 50m 30s{color} | {color:black} {color} |
\\ \\
|| Subsystem || Report/Notes ||
| Optional Tests | asflicense xml javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / 20c9a39 |
| Default Java | 1.8.0_111 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-8453/yetus/diff-checkstyle-ql.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-8453/yetus/diff-checkstyle-root.txt |
| modules | C: ql hcatalog/core hcatalog/hcatalog-pig-adapter hcatalog/webhcat/java-client . itests/hive-unit U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-8453/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.
> test acid default
> -
> Key: HIVE-18221
> URL: https://issues.apache.org/jira/browse/HIVE-18221
> Project: Hive
> Issue Type: Test
> Components: Transactions
> Affects Versions: 3.0.0
> Reporter: Eugene Koifman
> Assignee: Eugene Koifman
> Attachments: HIVE-18221.01.patch, HIVE-18221.02.patch, HIVE-18221.03.patch, HIVE-18221.04.patch, HIVE-18221.07.patch, HIVE-18221.08.patch, HIVE-18221.09.patch, HIVE-18221.10.patch, HIVE-18221.11.patch, HIVE-18221.12.patch, HIVE-18221.13.patch, HIVE-18221.14.patch, HIVE-18221.16.patch, HIVE-18221.18.patch, HIVE-18221.19.patch, HIVE-18221.20.patch, HIVE-18221.21.patch, HIVE-18221.22.patch, HIVE-18221.23.patch

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-18359) Extend grouping set limits from int to long
[ https://issues.apache.org/jira/browse/HIVE-18359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-18359: - Attachment: HIVE-18359.3.patch

> Extend grouping set limits from int to long
> ---
> Key: HIVE-18359
> URL: https://issues.apache.org/jira/browse/HIVE-18359
> Project: Hive
> Issue Type: Bug
> Affects Versions: 3.0.0
> Reporter: Prasanth Jayachandran
> Assignee: Prasanth Jayachandran
> Attachments: HIVE-18359.1.patch, HIVE-18359.2.patch, HIVE-18359.3.patch
>
> Grouping sets is broken for >32 columns because an int is used for the bitmap (and for the GROUPING__ID virtual column). This assumption breaks grouping sets/rollups/cube when the number of participating aggregation columns is >32. The easier fix is to extend it to long for now. The correct fix would be to use BitSets everywhere, but that would require the GROUPING__ID column type to be binary, which will make predicates on GROUPING__ID difficult to deal with.

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
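The >32-column breakage follows directly from Java integer shift semantics: on an `int`, the shift count is taken mod 32, so bit positions 32 and above wrap onto lower bits and distinct grouping columns collide in the bitmap. A small illustrative snippet (not Hive's actual code) showing why widening to `long` buys 64 positions:

```java
// Demonstrates the int-bitmap collision behind the grouping-set limit.
public class GroupingIdWidth {
    // int shift counts are masked to 5 bits: 1 << 40 is really 1 << 8
    static int intMask(int col)   { return 1 << col; }

    // long shift counts are masked to 6 bits: distinct bits up to col 63
    static long longMask(int col) { return 1L << col; }
}
```

So with an int GROUPING__ID, "column 40" and "column 8" produce the same mask bit; with a long they stay distinct, which is the "easier fix" the issue describes.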
[jira] [Commented] (HIVE-18368) Improve Spark Debug RDD Graph
[ https://issues.apache.org/jira/browse/HIVE-18368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312522#comment-16312522 ] Hive QA commented on HIVE-18368:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12904682/HIVE-18368.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.
{color:red}ERROR:{color} -1 due to 18 failed/errored test(s), 11547 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join25] (batchId=72)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=35)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucketsortoptimize_insert_2] (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[hybridgrace_hashjoin_2] (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] (batchId=168)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast] (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=159)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[authorization_part] (batchId=93)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[stats_aggregator_error_1] (batchId=93)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] (batchId=120)
org.apache.hadoop.hive.metastore.TestEmbeddedHiveMetaStore.testTransactionalValidation (batchId=213)
org.apache.hadoop.hive.ql.io.TestDruidRecordWriter.testWrite (batchId=253)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints (batchId=225)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=231)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=231)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=231)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/8452/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/8452/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-8452/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 18 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12904682 - PreCommit-HIVE-Build

> Improve Spark Debug RDD Graph
> -
> Key: HIVE-18368
> URL: https://issues.apache.org/jira/browse/HIVE-18368
> Project: Hive
> Issue Type: Sub-task
> Components: Spark
> Reporter: Sahil Takiar
> Assignee: Sahil Takiar
> Attachments: HIVE-18368.1.patch, Spark UI - Named RDDs.png
>
> The {{SparkPlan}} class does some logging to show the mapping between different {{SparkTran}}, what shuffle types are used, and what trans are cached. However, there is room for improvement.
> When debug logging is enabled the RDD graph is logged, but there isn't much information printed about each RDD.
> We should combine both of the graphs and improve them. We could even make the Spark Plan graph part of the {{EXPLAIN EXTENDED}} output.
> Ideally, the final graph shows a clear relationship between Tran objects, RDDs, and BaseWorks. Edges should include information about number of partitions, shuffle types, Spark operations used, etc.

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-18375) Cannot ORDER by subquery fields unless they are selected
[ https://issues.apache.org/jira/browse/HIVE-18375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312508#comment-16312508 ] Gopal V commented on HIVE-18375:

[~pauljackson123]: the first two queries run on HDP3, there's probably a fix that went in for this which isn't in the hive-2 branch.

{code}
0: jdbc:hive2://localhost:10007/tpcds_bin_par> EXPLAIN SELECT `first_name` `F_4`, `last_name` `F_5`
0: jdbc:hive2://localhost:10007/tpcds_bin_par> FROM `employees`
0: jdbc:hive2://localhost:10007/tpcds_bin_par> ORDER BY `emp_no` DESC;
Plan optimized by CBO.

Vertex dependency in root stage
Reducer 2 <- Map 1 (SIMPLE_EDGE)

Stage-0
  Fetch Operator
    limit:-1
    Stage-1
      Reducer 2 vectorized, llap
      File Output Operator [FS_9]
        Select Operator [SEL_8] (rows=6 width=202)
          Output:["_col0","_col1"]
        <-Map 1 [SIMPLE_EDGE] vectorized, llap
          SHUFFLE [RS_7]
            Select Operator [SEL_6] (rows=6 width=202)
              Output:["_col0","_col1","_col2"]
              TableScan [TS_0] (rows=6 width=202)
                testing@employees,employees,Tbl:COMPLETE,Col:NONE,Output:["first_name","last_name","emp_no"]
{code}

{code}
0: jdbc:hive2://localhost:10007/tpcds_bin_par>
0: jdbc:hive2://localhost:10007/tpcds_bin_par> EXPLAIN SELECT `first_name` `F_4`, `emp_no` `F_3`, `last_name` `F_5`
0: jdbc:hive2://localhost:10007/tpcds_bin_par> FROM `employees`
0: jdbc:hive2://localhost:10007/tpcds_bin_par> ORDER BY `emp_no` DESC;
Plan optimized by CBO.

Vertex dependency in root stage
Reducer 2 <- Map 1 (SIMPLE_EDGE)

Stage-0
  Fetch Operator
    limit:-1
    Stage-1
      Reducer 2 vectorized, llap
      File Output Operator [FS_8]
        Select Operator [SEL_7] (rows=6 width=202)
          Output:["_col0","_col1","_col2"]
        <-Map 1 [SIMPLE_EDGE] vectorized, llap
          SHUFFLE [RS_6]
            Select Operator [SEL_5] (rows=6 width=202)
              Output:["_col0","_col1","_col2"]
              TableScan [TS_0] (rows=6 width=202)
                testing@employees,employees,Tbl:COMPLETE,Col:NONE,Output:["first_name","emp_no","last_name"]
{code}

> Cannot ORDER by subquery fields unless they are selected
> ---
> Key: HIVE-18375
> URL: https://issues.apache.org/jira/browse/HIVE-18375
> Project: Hive
> Issue Type: Bug
> Components: Query Processor
> Affects Versions: 2.3.2
> Environment: Amazon AWS
> Release label: emr-5.11.0
> Hadoop distribution: Amazon 2.7.3
> Applications: Hive 2.3.2, Pig 0.17.0, Hue 4.0.1
> classification=hive-site,properties=[hive.strict.checks.cartesian.product=false,hive.mapred.mode=nonstrict]
> Reporter: Paul Jackson
> Priority: Minor
>
> Given these tables:
> {code:SQL}
> CREATE TABLE employees (
>   emp_no INT,
>   first_name VARCHAR(14),
>   last_name VARCHAR(16)
> );
> insert into employees values
>   (1, 'Gottlob', 'Frege'),
>   (2, 'Bertrand', 'Russell'),
>   (3, 'Ludwig', 'Wittgenstein');
> CREATE TABLE salaries (
>   emp_no INT,
>   salary INT,
>   from_date DATE,
>   to_date DATE
> );
> insert into salaries values
>   (1, 10, '1900-01-01', '1900-01-31'),
>   (1, 18, '1900-09-01', '1900-09-30'),
>   (2, 15, '1940-03-01', '1950-01-01'),
>   (3, 20, '1920-01-01', '1950-01-01');
> {code}
> This query returns the names of the employees ordered by their peak salary:
> {code:SQL}
> SELECT `employees`.`last_name`, `employees`.`first_name`, `t1`.`max_salary`
> FROM `default`.`employees`
> INNER JOIN
>   (SELECT `emp_no`, MAX(`salary`) `max_salary`
>    FROM `default`.`salaries`
>    WHERE `emp_no` IS NOT NULL AND `salary` IS NOT NULL
>    GROUP BY `emp_no`) AS `t1`
> ON `employees`.`emp_no` = `t1`.`emp_no`
> ORDER BY `t1`.`max_salary` DESC;
> {code}
> However, this should still work even if the max_salary is not part of the projection:
> {code:SQL}
> SELECT `employees`.`last_name`, `employees`.`first_name`
> FROM `default`.`employees`
> INNER JOIN
>   (SELECT `emp_no`, MAX(`salary`) `max_salary`
>    FROM `default`.`salaries`
>    WHERE `emp_no` IS NOT NULL AND `salary` IS NOT NULL
>    GROUP BY `emp_no`) AS `t1`
> ON `employees`.`emp_no` = `t1`.`emp_no`
> ORDER BY `t1`.`max_salary` DESC;
> {code}
> However, that fails with this error:
> {code}
> Error while compiling statement: FAILED: SemanticException [Error 10004]: line 9:9 Invalid table alias or column reference 't1': (possible column names are: last_name, first_name)
> {code}
> FWIW, this also fails:
> {code:SQL}
> SELECT `employees`.`last_name`, `employees`.`first_name`, `t1`.`max_salary` AS `max_sal`
> FROM `default`.`employees`
> INNER JOIN
>   (SELECT `emp_no`, MAX(`salary`) `max_salary`
>    FROM `default`.`salaries`
>    WHERE `emp_no` IS NOT NULL AND `salary` IS NOT NULL
>    GROUP BY `emp_no`) AS `t1`
> ON `employees`.`emp_no` = `t1`.`
[jira] [Updated] (HIVE-18350) load data should rename files consistent with insert statements
[ https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deepak Jaiswal updated HIVE-18350: -- Status: Patch Available (was: Open) > load data should rename files consistent with insert statements > --- > > Key: HIVE-18350 > URL: https://issues.apache.org/jira/browse/HIVE-18350 > Project: Hive > Issue Type: Bug >Reporter: Deepak Jaiswal >Assignee: Deepak Jaiswal > Attachments: HIVE-18350.1.patch > > > Insert statements create files of format ending with _0, 0001_0 etc. > However, the load data uses the input file name. That results in inconsistent > naming convention which makes SMB joins difficult in some scenarios and may > cause trouble for other types of queries in future. > We need consistent naming convention. > For non-bucketed table, hive renames all the files regardless of how they > were named by the user. > For bucketed table, hive relies on user to name the files matching the bucket > in non-strict mode. Hive assumes that the data belongs to same bucket in a > file. In strict mode, loading bucketed table is disabled. > This will likely affect most of the tests which load data which is pretty > significant. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
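The naming convention the issue asks LOAD DATA to adopt can be sketched as a small helper that maps a bucket number to an insert-style file name such as 000000_0. The exact format below (six-digit zero-padded task id plus a copy suffix) is an assumption for illustration, not Hive's actual renaming code.

```python
# Hedged sketch: rename arbitrary user-supplied input files to
# insert-style bucket names ("000000_0", "000001_0", ...). The
# formatting helper is an illustrative assumption, not the real
# Hive implementation.

def bucket_file_name(bucket_id, copy_suffix=0):
    """Six-digit zero-padded bucket/task id plus a copy suffix."""
    return f"{bucket_id:06d}_{copy_suffix}"

# For a non-bucketed table, every file gets renamed regardless of its
# original name, which is what gives SMB joins a consistent layout.
user_files = ["part-r-00000.txt", "mydata.csv", "extract.dat"]
renamed = [bucket_file_name(i) for i, _ in enumerate(user_files)]
print(renamed)  # ['000000_0', '000001_0', '000002_0']
```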
[jira] [Updated] (HIVE-18350) load data should rename files consistent with insert statements
[ https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deepak Jaiswal updated HIVE-18350: -- Attachment: HIVE-18350.1.patch Only contains changes for bucketed tables. > load data should rename files consistent with insert statements > --- > > Key: HIVE-18350 > URL: https://issues.apache.org/jira/browse/HIVE-18350 > Project: Hive > Issue Type: Bug >Reporter: Deepak Jaiswal >Assignee: Deepak Jaiswal > Attachments: HIVE-18350.1.patch > > > Insert statements create files of format ending with _0, 0001_0 etc. > However, the load data uses the input file name. That results in inconsistent > naming convention which makes SMB joins difficult in some scenarios and may > cause trouble for other types of queries in future. > We need consistent naming convention. > For non-bucketed table, hive renames all the files regardless of how they > were named by the user. > For bucketed table, hive relies on user to name the files matching the bucket > in non-strict mode. Hive assumes that the data belongs to same bucket in a > file. In strict mode, loading bucketed table is disabled. > This will likely affect most of the tests which load data which is pretty > significant. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-18214) Flaky test: TestSparkClient
[ https://issues.apache.org/jira/browse/HIVE-18214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312482#comment-16312482 ] Sahil Takiar commented on HIVE-18214: - [~aihuaxu] yes, that's correct. It sends a shutdown message to the {{RemoteDriver}} asynchronously. Then it creates another {{RemoteDriver}}, which leads to the exception. Yeah, we could add logic to do that, but again it's not something that would happen in production because every {{RemoteDriver}} is spawned in a separate container. The {{RemoteDriver#main(String args[])}} is run in a YARN container. And each {{RemoteDriver}} creates a single {{SparkContext}} in its constructor. We could just change {{TestSparkClient}} so that it always spawns the {{RemoteDriver}} in a separate process; I checked and it only makes the test take an extra 20 seconds. The code to run the {{RemoteDriver}} in the local process was only ever meant for test purposes. > Flaky test: TestSparkClient > --- > > Key: HIVE-18214 > URL: https://issues.apache.org/jira/browse/HIVE-18214 > Project: Hive > Issue Type: Sub-task > Components: Spark >Reporter: Sahil Takiar >Assignee: Sahil Takiar > Attachments: HIVE-18214.1.patch > > > Looks like there is a race condition in {{TestSparkClient#runTest}}. The test > creates a {{RemoteDriver}} in memory, which creates a {{JavaSparkContext}}. A > new {{JavaSparkContext}} is created for each test that is run. There is a > race condition where the {{RemoteDriver}} isn't given enough time to > shut down, so when the next test starts running it creates another > {{JavaSparkContext}} which causes an exception like > {{org.apache.spark.SparkException: Only one SparkContext may be running in > this JVM (see SPARK-2243)}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
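The race described in the comment above can be sketched independently of Spark: a process-wide "only one context" invariant plus an asynchronous shutdown request is enough to reproduce the failure mode. The class and names below are illustrative stand-ins, not the actual {{SparkClient}}/{{RemoteDriver}} code.

```python
# Illustrative sketch (not Hive/Spark code): a one-context-per-process
# invariant, like Spark's one-SparkContext-per-JVM rule. If shutdown is
# only *requested* asynchronously and the next test starts immediately,
# creating a second context fails exactly as in the flaky test.

class OneContextPerProcess:
    _live = None  # mimics the one-live-SparkContext-per-JVM invariant

    def __init__(self):
        if OneContextPerProcess._live is not None:
            raise RuntimeError("Only one context may be running in this process")
        OneContextPerProcess._live = self

    def stop(self):
        OneContextPerProcess._live = None

ctx1 = OneContextPerProcess()
# Async shutdown: the message was sent, but stop() has not executed yet...
race_hit = False
try:
    OneContextPerProcess()  # the next test starts too early
except RuntimeError:
    race_hit = True         # same symptom as SPARK-2243

ctx1.stop()                  # once shutdown actually completes...
ctx2 = OneContextPerProcess()  # ...creation succeeds again
ctx2.stop()
```

Running each driver in a separate process, as the comment suggests, sidesteps the invariant entirely, which is why the extra ~20 seconds buys determinism.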
[jira] [Commented] (HIVE-18368) Improve Spark Debug RDD Graph
[ https://issues.apache.org/jira/browse/HIVE-18368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312478#comment-16312478 ] Hive QA commented on HIVE-18368: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Findbugs executables are not available. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 45s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 56s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 29s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 50s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 14s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 32s{color} | {color:red} ql: The patch generated 1 new + 54 unchanged - 9 fixed = 55 total (was 63) {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 51s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 15s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 13m 3s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh | | git revision | master / 20c9a39 | | Default Java | 1.8.0_111 | | checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-8452/yetus/diff-checkstyle-ql.txt | | modules | C: ql U: ql | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-8452/yetus.txt | | Powered by | Apache Yetushttp://yetus.apache.org | This message was automatically generated. > Improve Spark Debug RDD Graph > - > > Key: HIVE-18368 > URL: https://issues.apache.org/jira/browse/HIVE-18368 > Project: Hive > Issue Type: Sub-task > Components: Spark >Reporter: Sahil Takiar >Assignee: Sahil Takiar > Attachments: HIVE-18368.1.patch, Spark UI - Named RDDs.png > > > The {{SparkPlan}} class does some logging to show the mapping between > different {{SparkTran}}, what shuffle types are used, and what trans are > cached. However, there is room for improvement. > When debug logging is enabled the RDD graph is logged, but there isn't much > information printed about each RDD. > We should combine both of the graphs and improve them. We could even make the > Spark Plan graph part of the {{EXPLAIN EXTENDED}} output. 
> Ideally, the final graph shows a clear relationship between Tran objects, > RDDs, and BaseWorks. Edges should include information about the number of > partitions, shuffle types, Spark operations used, etc. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-18061) q.outs: be more selective with masking hdfs paths
[ https://issues.apache.org/jira/browse/HIVE-18061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312469#comment-16312469 ] Hive QA commented on HIVE-18061: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12904598/HIVE-18061.02.patch {color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 172 failed/errored test(s), 11548 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join25] (batchId=72) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mapjoin_hook] (batchId=12) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=35) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=175) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_unencrypted_nonhdfs_external_tables] (batchId=173) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[acid_bucket_pruning] (batchId=147) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[bucket5] (batchId=149) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[bucket6] (batchId=146) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[cte_2] (batchId=148) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[cte_4] (batchId=150) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[dynamic_partition_pruning_2] (batchId=148) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[dynamic_semijoin_user_level] (batchId=149) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[empty_dir_in_table] (batchId=146) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[except_distinct] (batchId=149) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[explainuser_2] (batchId=149) 
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[external_table_with_space_in_location_path] (batchId=149) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[file_with_header_footer] (batchId=150) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[global_limit] (batchId=148) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[import_exported_table] (batchId=149) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[insert_into1] (batchId=146) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[insert_into2] (batchId=150) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[intersect_all] (batchId=148) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[intersect_distinct] (batchId=149) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_nullscan] (batchId=146) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] (batchId=150) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_stats] (batchId=146) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llapdecider] (batchId=146) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[load_fs2] (batchId=149) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[load_hdfs_file_with_space_in_the_name] (batchId=146) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[mapreduce1] (batchId=149) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[mapreduce2] (batchId=146) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[mm_all] (batchId=149) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[multi_count_distinct_null] (batchId=150) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_llap_counters1] (batchId=146) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_llap_counters] (batchId=150) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_merge10] (batchId=149) 
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_merge1] (batchId=146) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_merge2] (batchId=150) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_merge3] (batchId=146) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_merge4] (batchId=149) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_merge_diff_fs] (batchId=146) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] (batchId=146) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a] (batchId=148) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[parallel_colstats] (batchId=148) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[parquet_complex_types_vectorization] (batchId=149) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[parquet_map_type_vectorization] (batchId=
[jira] [Commented] (HIVE-18359) Extend grouping set limits from int to long
[ https://issues.apache.org/jira/browse/HIVE-18359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312461#comment-16312461 ] Pengcheng Xiong commented on HIVE-18359: LGTM +1 pending tests. :) > Extend grouping set limits from int to long > --- > > Key: HIVE-18359 > URL: https://issues.apache.org/jira/browse/HIVE-18359 > Project: Hive > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-18359.1.patch, HIVE-18359.2.patch > > > Grouping sets is broken for >32 columns because of usage of Int for bitmap > (also GROUPING__ID virtual column). This assumption breaks grouping > sets/rollups/cube when number of participating aggregation columns is >32. > The easier fix would be extend it to Long for now. The correct fix would be > to use BitSets everywhere but that would require GROUPING__ID column type to > binary which will make predicates on GROUPING__ID difficult to deal with. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
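The >32-column breakage described above follows directly from using a bitmap with one bit per grouping column: a signed 32-bit int can only distinguish 32 columns before the ID overflows. A minimal sketch (illustrative Python, not Hive's Java code):

```python
# Illustrative sketch: grouping-set IDs as per-column bitmaps. With one
# bit per participating grouping column, the full-cube ID for 33 columns
# needs bit 32 set, which no longer fits in a signed 32-bit int (the
# Java `int` backing GROUPING__ID) but fits comfortably in a long.

INT_MAX = 2**31 - 1   # Java Integer.MAX_VALUE
LONG_MAX = 2**63 - 1  # Java Long.MAX_VALUE

def grouping_id(present_columns):
    """Set one bit per participating column index (0-based)."""
    gid = 0
    for col in present_columns:
        gid |= 1 << col
    return gid

gid_33 = grouping_id(range(33))  # all 33 columns present
print(gid_33 > INT_MAX)          # True: an int-based bitmap overflows
print(gid_33 <= LONG_MAX)        # True: a long-based bitmap still works
```

This is why the patch extends the bitmap to long (good up to 64 columns), while a BitSet-based GROUPING__ID would need a binary column type that is awkward for predicates.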
[jira] [Commented] (HIVE-18061) q.outs: be more selective with masking hdfs paths
[ https://issues.apache.org/jira/browse/HIVE-18061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312446#comment-16312446 ] Hive QA commented on HIVE-18061: | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Findbugs executables are not available. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 21s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m 43s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 24s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 46s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 10s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 21s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 26s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 26s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} 
checkstyle {color} | {color:green} 0m 32s{color} | {color:green} The patch ql passed checkstyle {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 14s{color} | {color:green} itests/util: The patch generated 0 new + 188 unchanged - 6 fixed = 188 total (was 194) {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 11s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 12s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 16m 28s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh | | git revision | master / 20c9a39 | | Default Java | 1.8.0_111 | | modules | C: ql itests/util U: . | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-8451/yetus.txt | | Powered by | Apache Yetushttp://yetus.apache.org | This message was automatically generated. > q.outs: be more selective with masikng hdfs paths > - > > Key: HIVE-18061 > URL: https://issues.apache.org/jira/browse/HIVE-18061 > Project: Hive > Issue Type: Improvement >Reporter: Zoltan Haindrich >Assignee: Laszlo Bodor > Attachments: HIVE-18061.01.patch, HIVE-18061.02.patch > > > currently any line which contains a path which looks like an hdfs location is > replaced with a "masked pattern was here"... 
> it might be relevant to record these messages, since even an exception > message might contain an hdfs location > noticed in > HIVE-18012 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-18359) Extend grouping set limits from int to long
[ https://issues.apache.org/jira/browse/HIVE-18359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-18359: - Attachment: HIVE-18359.2.patch > Extend grouping set limits from int to long > --- > > Key: HIVE-18359 > URL: https://issues.apache.org/jira/browse/HIVE-18359 > Project: Hive > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-18359.1.patch, HIVE-18359.2.patch > > > Grouping sets is broken for >32 columns because of usage of Int for bitmap > (also GROUPING__ID virtual column). This assumption breaks grouping > sets/rollups/cube when number of participating aggregation columns is >32. > The easier fix would be extend it to Long for now. The correct fix would be > to use BitSets everywhere but that would require GROUPING__ID column type to > binary which will make predicates on GROUPING__ID difficult to deal with. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-18238) Driver execution may not have configuration-changing side effects
[ https://issues.apache.org/jira/browse/HIVE-18238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312432#comment-16312432 ] Hive QA commented on HIVE-18238: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12904675/HIVE-18238.04wip01.patch {color:green}SUCCESS:{color} +1 due to 9 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 20 failed/errored test(s), 11121 tests executed *Failed tests:* {noformat} TestMiniLlapLocalCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=161) [dynamic_semijoin_reduction.q,materialized_view_create_rewrite_3.q,vectorization_pushdown.q,correlationoptimizer2.q,cbo_gby_empty.q,vectorization_short_regress.q,identity_project_remove_skip.q,mapjoin3.q,cross_product_check_1.q,unionDistinct_3.q,cbo_join.q,correlationoptimizer6.q,union_remove_26.q,cbo_rp_limit.q,vector_groupby_cube1.q,current_date_timestamp.q,union2.q,groupby2.q,schema_evol_text_vec_table.q,dynpart_sort_opt_vectorization.q,exchgpartition2lel.q,multiMapJoin1.q,sample10.q,vectorized_timestamp_ints_casts.q,vector_char_simple.q,auto_sortmerge_join_2.q,bucketizedhiveinputformat.q,vectorization_input_format_excludes.q,cte_mat_2.q,vectorization_8.q] TestNegativeCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=92) 
[nopart_insert.q,insert_into_with_schema.q,input41.q,having1.q,create_table_failure3.q,database_drop_not_empty_restrict.q,windowing_after_orderby.q,orderbysortby.q,subquery_select_distinct2.q,authorization_uri_alterpart_loc.q,udf_last_day_error_1.q,create_table_failure4.q,semijoin5.q,udf_format_number_wrong4.q,deletejar.q,exim_11_nonpart_noncompat_sorting.q,show_tables_bad_db2.q,drop_func_nonexistent.q,nopart_load.q,alter_table_non_partitioned_table_cascade.q,load_wrong_fileformat.q,lockneg_try_db_lock_conflict.q,udf_field_wrong_args_len.q,create_table_failure2.q,groupby2_map_skew_multi_distinct.q,udf_min.q,authorization_update_noupdatepriv.q,show_columns2.q,authorization_insert_noselectpriv.q,orc_replace_columns3_acid.q,udf_instr_wrong_args_len.q,compare_double_bigint.q,authorization_set_nonexistent_conf.q,alter_rename_partition_failure3.q,split_sample_wrong_format2.q,create_with_fk_pk_same_tab.q,authorization_show_roles_no_admin.q,materialized_view_authorization_rebuild_no_grant.q,unionLimit.q,authorization_revoke_table_fail2.q,authorization_insert_noinspriv.q,duplicate_insert3.q,authorization_desc_table_nosel.q,invalid_select_column.q,stats_noscan_non_native.q,orc_change_serde_acid.q,create_or_replace_view7.q,exim_07_nonpart_noncompat_ifof.q,udf_concat_ws_wrong2.q,fileformat_bad_class.q,merge_negative_2.q,exim_15_part_nonpart.q,authorization_not_owner_drop_view.q,external1.q,authorization_uri_insert.q,create_with_fk_wrong_ref.q,columnstats_tbllvl_incorrect_column.q,authorization_show_parts_nosel.q,merge_negative_1.q,authorization_not_owner_drop_tab.q,external2.q,authorization_deletejar.q,temp_table_create_like_partitions.q,udf_greatest_error_1.q,ptf_negative_AggrFuncsWithNoGBYNoPartDef.q,alter_view_as_select_not_exist.q,touch1.q,groupby3_map_skew_multi_distinct.q,exchange_partition_neg_partition_missing.q,groupby_cube_multi_gby.q,columnstats_tbllvl.q,drop_invalid_constraint2.q,alter_table_add_partition.q,update_not_acid.q,archive5.q,alter_table_constraint_invalid
_pk_col.q,ivyDownload.q,udf_instr_wrong_type.q,bad_sample_clause.q,authorization_not_owner_drop_tab2.q,authorization_alter_db_owner.q,show_columns1.q,orc_type_promotion3.q,create_view_failure8.q,strict_join.q,udf_add_months_error_1.q,groupby_cube2.q,drop_partition_filter_failure.q,groupby_cube1.q,groupby_rollup1.q,genericFileFormat.q,authorization_create_macro1.q,invalid_cast_from_binary_4.q,drop_invalid_constraint1.q,serde_regex.q,show_partitions1.q,invalid_cast_from_binary_6.q,create_with_multi_pk_constraint.q,udf_field_wrong_type.q,groupby_grouping_sets4.q,groupby_grouping_sets3.q,load_data_into_acid.q,insertsel_fail.q,udf_locate_wrong_type.q,orc_type_promotion1_acid.q,set_table_property.q,create_or_replace_view2.q,groupby_grouping_sets2.q,alter_view_failure.q,distinct_windowing_failure1.q,invalid_t_alter2.q,alter_table_constraint_invalid_fk_col1.q,invalid_varchar_length_2.q,authorization_show_grant_otheruser_alltabs.q,subquery_windowing_corr.q,compact_non_acid_table.q,authorization_view_4.q,authorization_disallow_transform.q,materialized_view_authorization_rebuild_other.q,authorization_fail_4.q,dbtxnmgr_nodblock.q,set_hiveconf_internal_variable1.q,input_part0_neg.q,udf_printf_wrong3.q,load_orc_negative2.q,druid_buckets.q,archive2.q,authorization_addjar.q,invalid_sum_syntax.q,insert_into_with_schema1.q,udf_add_months_error_2.q,dyn_part_max_per_node.q,authorization_revoke_table_fail1.q,udf_printf_wrong2.q,archive_multi3.q,udf_printf_wrong1.q,subquery_subquery_chain.q,authorization_view_disable_cbo_4.q,no_matching_udf.q,char_pad_
[jira] [Assigned] (HIVE-18381) Drop table operation doesn't consider the hdfs acl privilege of the table location parent path
[ https://issues.apache.org/jira/browse/HIVE-18381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] youchuikai reassigned HIVE-18381: - > Drop table operation isn't consider that hdfs acl privilege of the table > location parent path > --- > > Key: HIVE-18381 > URL: https://issues.apache.org/jira/browse/HIVE-18381 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 1.1.0 > Environment: hive-1.1.0-cdh5.8.4 >Reporter: youchuikai >Assignee: youchuikai > > {code:sql} > // the push user belong to the test_rw group > hive> dfs -getfacl /user/hive/warehouse1/test1.db; > # file: /user/hive/warehouse1/test1.db > # owner: root > # group: hive > user::rwx > group::rwx > group:test_r:r-x > group:test_rw:rwx > mask::rwx > other::--- > default:user::rwx > default:group::rwx > default:group:test_r:r-x > default:group:test_rw:rwx > default:mask::rwx > default:other::--- > hive> drop table test1.youck_66; > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Table metadata > not deleted since hdfs://nameservice-test1/user/hive/warehouse1/test1.db is > not writable by push) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Reopened] (HIVE-18326) LLAP Tez scheduler - only preempt tasks if there's a dependency between them
[ https://issues.apache.org/jira/browse/HIVE-18326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin reopened HIVE-18326: - Reverted the patch; it looks like it breaks in some cases. I am looking into it; it appears that the DAG info doesn't have the entire DAG, or something like that, for some combination of union and multi-insert > LLAP Tez scheduler - only preempt tasks if there's a dependency between them > > > Key: HIVE-18326 > URL: https://issues.apache.org/jira/browse/HIVE-18326 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Fix For: 3.0.0 > > Attachments: HIVE-18326.01.patch, HIVE-18326.02.patch, > HIVE-18326.patch > > > It is currently possible for e.g. two sides of a union (or a join for that > matter) to have slightly different priorities. We don't want to preempt > running tasks on one side in favor of the other side in such cases. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
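One plausible reading of the fix above is a reachability check over the vertex DAG: a waiting task may preempt a running one only if there is a transitive dependency between their vertices, so union siblings with slightly different priorities never preempt each other. The DAG, vertex names, and helper below are hypothetical illustrations, not the LLAP Tez scheduler code.

```python
# Illustrative sketch (not the LLAP scheduler): preempt a running task
# only when the waiting task's vertex transitively depends on it.
# Hypothetical DAG: Map 1 and Map 2 are union siblings feeding Reducer 1.
edges = {"Map 1": ["Reducer 1"], "Map 2": ["Reducer 1"], "Reducer 1": []}

def depends_on(dag, downstream, upstream):
    """True if `downstream` is reachable from `upstream` in the DAG."""
    stack, seen = [upstream], set()
    while stack:
        v = stack.pop()
        if v == downstream:
            return True
        if v not in seen:
            seen.add(v)
            stack.extend(dag.get(v, []))
    return False

def should_preempt(dag, waiting_vertex, running_vertex):
    """Only preempt when the waiting work actually needs the running
    vertex's output; unrelated branches are left alone."""
    return depends_on(dag, waiting_vertex, running_vertex)

print(should_preempt(edges, "Reducer 1", "Map 1"))  # True: real dependency
print(should_preempt(edges, "Map 2", "Map 1"))      # False: union siblings
```

The reopening comment also hints at why this is fragile: if the scheduler's DAG info is incomplete for some union/multi-insert plans, the reachability check cannot see the true dependencies.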
[jira] [Commented] (HIVE-18328) Improve schematool validator to report duplicate rows for column statistics
[ https://issues.apache.org/jira/browse/HIVE-18328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312406#comment-16312406 ] Naveen Gangam commented on HIVE-18328: -- The test failures do not appear to be related to the patch. The previous builds have the same failures and some more. So +1 for me. [~aihuaxu] Could you please review this when you get a chance? Thanks > Improve schematool validator to report duplicate rows for column statistics > --- > > Key: HIVE-18328 > URL: https://issues.apache.org/jira/browse/HIVE-18328 > Project: Hive > Issue Type: Improvement > Components: Hive >Affects Versions: 2.1.1 >Reporter: Naveen Gangam >Assignee: Naveen Gangam > Attachments: HIVE-18328.patch > > > By design, in the {{TAB_COL_STATS}} table of the HMS schema, there should be > ONE AND ONLY ONE row, representing its statistics, for each column defined in > hive. A combination of DB_NAME, TABLE_NAME and COLUMN_NAME constitutes a > primary key/unique row. > Each time the statistics are computed for a column, this row is updated. > However, if somehow, via the BDR/replication process, we end up with multiple > rows in this table for a given column, the HMS server cannot reliably recompute the statistics > thereafter. > So it would be good to detect this data anomaly via the schema validation > tool. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
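The anomaly the validator should report reduces to a GROUP BY/HAVING aggregate over the (DB_NAME, TABLE_NAME, COLUMN_NAME) key. The sketch below uses an in-memory sqlite3 database with a simplified stand-in schema; the real TAB_COL_STATS DDL in the HMS differs.

```python
# Hedged sketch: detect duplicate column-statistics rows with a plain
# SQL aggregate. The table here is a simplified stand-in for
# TAB_COL_STATS; the actual HMS schema has more columns.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE TAB_COL_STATS (
    DB_NAME TEXT, TABLE_NAME TEXT, COLUMN_NAME TEXT, NUM_NULLS INT)""")
rows = [
    ("default", "employees", "emp_no", 0),
    ("default", "employees", "emp_no", 0),    # duplicate key triple
    ("default", "employees", "last_name", 0),
]
conn.executemany("INSERT INTO TAB_COL_STATS VALUES (?,?,?,?)", rows)

# Any group with more than one row violates the one-row-per-column design.
dups = conn.execute("""
    SELECT DB_NAME, TABLE_NAME, COLUMN_NAME, COUNT(*) AS n
    FROM TAB_COL_STATS
    GROUP BY DB_NAME, TABLE_NAME, COLUMN_NAME
    HAVING COUNT(*) > 1
""").fetchall()
print(dups)  # [('default', 'employees', 'emp_no', 2)]
```

A schematool validator could run the same shape of query against the metastore backend and report each offending key triple with its row count.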
[jira] [Updated] (HIVE-18361) Extend shared work optimizer to reuse computation beyond work boundaries
[ https://issues.apache.org/jira/browse/HIVE-18361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-18361: --- Resolution: Fixed Fix Version/s: 3.0.0 Status: Resolved (was: Patch Available) Regenerated q files and pushed to master, thanks for reviewing [~ashutoshc]! > Extend shared work optimizer to reuse computation beyond work boundaries > > > Key: HIVE-18361 > URL: https://issues.apache.org/jira/browse/HIVE-18361 > Project: Hive > Issue Type: New Feature > Components: Physical Optimizer >Affects Versions: 3.0.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Labels: TODOC3.0 > Fix For: 3.0.0 > > Attachments: HIVE-18361.01.patch, HIVE-18361.02.patch, > HIVE-18361.patch > > > Follow-up of the work in HIVE-16867. > HIVE-16867 introduced an optimization that identifies scans on input tables > that can be merged and reuses the computation that is done in the work > containing those scans. In particular, we traverse both parts of the plan > upstream and reuse the operators if possible. > Currently, the optimizer will not go beyond the output edge(s) of that work. > This extension removes that limitation. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-18238) Driver execution may not have configuration-changing side effects
[ https://issues.apache.org/jira/browse/HIVE-18238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312399#comment-16312399 ] Hive QA commented on HIVE-18238:
| (x) *{color:red}-1 overall{color}* |
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 1s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 27s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m 17s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 39s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 35s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 48s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 21s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 37s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 42s{color} | {color:red} ql: The patch generated 17 new + 1288 unchanged - 7 fixed = 1305 total (was 1295) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 8s{color} | {color:red} cli: The patch generated 1 new + 38 unchanged - 1 fixed = 39 total (was 39) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 11s{color} | {color:green} hcatalog/core: The patch generated 0 new + 33 unchanged - 1 fixed = 33 total (was 34) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 9s{color} | {color:green} The patch hcatalog-pig-adapter passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 9s{color} | {color:green} The patch server-extensions passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 11s{color} | {color:green} The patch hive-unit passed checkstyle {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 50s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 11s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 22m 59s{color} | {color:black} {color} |
|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / 3f5148d |
| Default Java | 1.8.0_111 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-8450/yetus/diff-checkstyle-ql.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-8450/yetus/diff-checkstyle-cli.txt |
| modules | C: ql cli hcatalog/core hcatalog/hcatalog-pig-adapter hcatalog/server-extensions itests/hive-unit U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-8450/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |
This message was automatically generated.
> Driver execution may not have configuration changing sideeffects
> -
>
> Key: HIVE-18238
> URL: https://issues.apache.org/jira/browse/HIVE-18238
> Project: Hive
> Issue Type: Sub-task
> Components: Logical Optimizer
> Reporter: Zoltan H
[jira] [Commented] (HIVE-18366) Update HBaseSerDe to use hbase.mapreduce.hfileoutputformat.table.name instead of hbase.table.name as the table name property
[ https://issues.apache.org/jira/browse/HIVE-18366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312374#comment-16312374 ] Hive QA commented on HIVE-18366: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12904671/HIVE-18366.1.patch {color:green}SUCCESS:{color} +1 due to 18 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 22 failed/errored test(s), 11517 tests executed *Failed tests:* {noformat} TestMiniLlapLocalCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=165) [vector_interval_2.q,schema_evol_orc_acid_table_update.q,metadataonly1.q,auto_join_nulls.q,metadata_only_queries_with_filters.q,schema_evol_text_nonvec_part_all_complex.q,alter_merge_orc.q,vector_between_columns.q,vector_char_cast.q,vector_groupby_grouping_sets6.q,join_filters.q,udaf_collect_set_2.q,update_after_multiple_inserts.q,offset_limit_ppd_optimizer.q,materialized_view_describe.q,orc_merge_incompat1.q,vectorized_parquet_types.q,vector_windowing_gby2.q,explainanalyze_2.q,vectorization_15.q,union7.q,vectorization_nested_udf.q,vector_char_2.q,schema_evol_orc_acidvec_part.q,vector_groupby_3.q,materialized_view_create_rewrite_multi_db.q,acid_no_buckets.q,cbo_rp_gby.q,auto_sortmerge_join_9.q,vector_groupby_grouping_id2.q] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join25] (batchId=72) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mapjoin_hook] (batchId=12) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=35) org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver[external_table_ppd] (batchId=96) org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver[hbase_binary_storage_queries] (batchId=99) org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver[hbase_ddl] (batchId=98) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucketsortoptimize_insert_2] (batchId=151) 
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[hybridgrace_hashjoin_2] (batchId=156) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=164) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] (batchId=168) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast] (batchId=159) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=159) org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[authorization_part] (batchId=93) org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[stats_aggregator_error_1] (batchId=93) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] (batchId=120) org.apache.hadoop.hive.metastore.TestEmbeddedHiveMetaStore.testTransactionalValidation (batchId=213) org.apache.hadoop.hive.ql.io.TestDruidRecordWriter.testWrite (batchId=253) org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints (batchId=225) org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=231) org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=231) org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=231) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/8449/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/8449/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-8449/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 22 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12904671 - PreCommit-HIVE-Build > Update HBaseSerDe to use hbase.mapreduce.hfileoutputformat.table.name instead > of hbase.table.name as the table name property > > > Key: HIVE-18366 > URL: https://issues.apache.org/jira/browse/HIVE-18366 > Project: Hive > Issue Type: Sub-task > Components: HBase Handler >Affects Versions: 3.0.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Attachments: HIVE-18366.1.patch > > > HBase 2.0 changes the table name property to > hbase.mapreduce.hfileoutputformat.table.name. HiveHFileOutputFormat is using > the new property name while HiveHBaseTableOutputFormat is not. If we create > the table as follows, HiveHBaseTableOutputFormat is used which still uses the > old property hbase.table.name. > {noformat} > create table hbase_table2(key int, val string) stored by > 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' with serdeproperties > ('hbase.columns.mapping' = ':key,cf:val') tblproperties > ('hbase.mapreduce.hfileoutputformat.table.name' = > 'positive_hbase_handler_bulk') > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-18375) Cannot ORDER by subquery fields unless they are selected
[ https://issues.apache.org/jira/browse/HIVE-18375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312371#comment-16312371 ] Pengcheng Xiong commented on HIVE-18375: [~pauljackson123], I am sorry, but I saw that all of your above cases involve ORDER BY. Which simpler issue do you mean? > Cannot ORDER by subquery fields unless they are selected > > > Key: HIVE-18375 > URL: https://issues.apache.org/jira/browse/HIVE-18375 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 2.3.2 > Environment: Amazon AWS > Release label:emr-5.11.0 > Hadoop distribution:Amazon 2.7.3 > Applications:Hive 2.3.2, Pig 0.17.0, Hue 4.0.1 > classification=hive-site,properties=[hive.strict.checks.cartesian.product=false,hive.mapred.mode=nonstrict] >Reporter: Paul Jackson >Priority: Minor > > Given these tables: > {code:SQL} > CREATE TABLE employees ( > emp_no INT, > first_name VARCHAR(14), > last_name VARCHAR(16) > ); > insert into employees values > (1, 'Gottlob', 'Frege'), > (2, 'Bertrand', 'Russell'), > (3, 'Ludwig', 'Wittgenstein'); > CREATE TABLE salaries ( > emp_no INT, > salary INT, > from_date DATE, > to_date DATE > ); > insert into salaries values > (1, 10, '1900-01-01', '1900-01-31'), > (1, 18, '1900-09-01', '1900-09-30'), > (2, 15, '1940-03-01', '1950-01-01'), > (3, 20, '1920-01-01', '1950-01-01'); > {code} > This query returns the names of the employees ordered by their peak salary: > {code:SQL} > SELECT `employees`.`last_name`, `employees`.`first_name`, `t1`.`max_salary` > FROM `default`.`employees` > INNER JOIN > (SELECT `emp_no`, MAX(`salary`) `max_salary` > FROM `default`.`salaries` > WHERE `emp_no` IS NOT NULL AND `salary` IS NOT NULL > GROUP BY `emp_no`) AS `t1` > ON `employees`.`emp_no` = `t1`.`emp_no` > ORDER BY `t1`.`max_salary` DESC; > {code} > However, this should still work even if the max_salary is not part of the > projection: > {code:SQL} > SELECT `employees`.`last_name`, `employees`.`first_name` > 
FROM `default`.`employees` > INNER JOIN > (SELECT `emp_no`, MAX(`salary`) `max_salary` > FROM `default`.`salaries` > WHERE `emp_no` IS NOT NULL AND `salary` IS NOT NULL > GROUP BY `emp_no`) AS `t1` > ON `employees`.`emp_no` = `t1`.`emp_no` > ORDER BY `t1`.`max_salary` DESC; > {code} > However, that fails with this error: > {code} > Error while compiling statement: FAILED: SemanticException [Error 10004]: > line 9:9 Invalid table alias or column reference 't1': (possible column names > are: last_name, first_name) > {code} > FWIW, this also fails: > {code:SQL} > SELECT `employees`.`last_name`, `employees`.`first_name`, `t1`.`max_salary` > AS `max_sal` > FROM `default`.`employees` > INNER JOIN > (SELECT `emp_no`, MAX(`salary`) `max_salary` > FROM `default`.`salaries` > WHERE `emp_no` IS NOT NULL AND `salary` IS NOT NULL > GROUP BY `emp_no`) AS `t1` > ON `employees`.`emp_no` = `t1`.`emp_no` > ORDER BY `t1`.`max_salary` DESC; > {code} > But this succeeds: > {code:SQL} > SELECT `employees`.`last_name`, `employees`.`first_name`, `t1`.`max_salary` > AS `max_sal` > FROM `default`.`employees` > INNER JOIN > (SELECT `emp_no`, MAX(`salary`) `max_salary` > FROM `default`.`salaries` > WHERE `emp_no` IS NOT NULL AND `salary` IS NOT NULL > GROUP BY `emp_no`) AS `t1` > ON `employees`.`emp_no` = `t1`.`emp_no` > ORDER BY `max_sal` DESC; > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-18375) Cannot ORDER by subquery fields unless they are selected
[ https://issues.apache.org/jira/browse/HIVE-18375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312359#comment-16312359 ] Paul Jackson commented on HIVE-18375: - There is no doubt these are the same issue. What do you think about the simpler issue in my comment that does not involve ORDER BY? -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-18375) Cannot ORDER by subquery fields unless they are selected
[ https://issues.apache.org/jira/browse/HIVE-18375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312344#comment-16312344 ] Pengcheng Xiong commented on HIVE-18375: [~pauljackson123], if possible, could you try Hive master? As this is a new feature from HIVE-15160 targeting version 3.0, I doubt it is available in any published version yet. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-18366) Update HBaseSerDe to use hbase.mapreduce.hfileoutputformat.table.name instead of hbase.table.name as the table name property
[ https://issues.apache.org/jira/browse/HIVE-18366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312309#comment-16312309 ] Hive QA commented on HIVE-18366:
| (x) *{color:red}-1 overall{color}* |
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 1s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 43s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m 36s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 22s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 38s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 50s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 21s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 25s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 13s{color} | {color:red} itests/util: The patch generated 1 new + 11 unchanged - 0 fixed = 12 total (was 11) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 46s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 12s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 15m 35s{color} | {color:black} {color} |
|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / 3f5148d |
| Default Java | 1.8.0_111 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-8449/yetus/diff-checkstyle-itests_util.txt |
| modules | C: hbase-handler hcatalog/webhcat/svr itests/hcatalog-unit itests/util U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-8449/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |
This message was automatically generated.
> Update HBaseSerDe to use hbase.mapreduce.hfileoutputformat.table.name instead
> of hbase.table.name as the table name property
>
> Key: HIVE-18366
> URL: https://issues.apache.org/jira/browse/HIVE-18366
> Project: Hive
> Issue Type: Sub-task
> Components: HBase Handler
> Affects Versions: 3.0.0
> Reporter: Aihua Xu
> Assignee: Aihua Xu
> Attachments: HIVE-18366.1.patch
>
> HBase 2.0 changes the table name property to
> hbase.mapreduce.hfileoutputformat.table.name. HiveHFileOutputFormat is using
> the new property name while HiveHBaseTableOutputFormat is not. If we create
> the table as follows, HiveHBaseTableOutputFormat is used which still uses the
> old property hbase.table.name.
> {noformat}
> create table hbase_table2(key int, val string) stored by
> 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' with serdeproperties
> ('hbase.columns.mapping' = ':key,cf:val') tblproperties
> ('hbase.mapreduce.hfileoutputformat.table.name' = 'positive_hbase_handler_bulk')
> {noformat}
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
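The report above shows the two property names side by side; a minimal compatibility sketch (the class and helper names are illustrative, not Hive code) that writes the table name under both keys, so code reading either the old or the new property resolves the same table:

```java
import java.util.HashMap;
import java.util.Map;

public class TableNameCompat {
    // Pre-HBase-2.0 and HBase-2.0 property keys for the output table name.
    static final String OLD_KEY = "hbase.table.name";
    static final String NEW_KEY = "hbase.mapreduce.hfileoutputformat.table.name";

    // Hypothetical helper: set both keys so a reader of either property
    // (HiveHBaseTableOutputFormat vs. HiveHFileOutputFormat) finds the table.
    static Map<String, String> withTableName(Map<String, String> props, String table) {
        props.put(OLD_KEY, table);
        props.put(NEW_KEY, table);
        return props;
    }

    public static void main(String[] args) {
        Map<String, String> props =
            withTableName(new HashMap<>(), "positive_hbase_handler_bulk");
        System.out.println(props.get(OLD_KEY).equals(props.get(NEW_KEY))); // prints "true"
    }
}
```

Writing both keys is a common transition tactic while callers migrate from the deprecated name; the actual patch may instead update the reader side.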
[jira] [Updated] (HIVE-18269) LLAP: Fast llap io with slow processing pipeline can lead to OOM
[ https://issues.apache.org/jira/browse/HIVE-18269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-18269: Status: Patch Available (was: Open) Done... I am trying to test it on a cluster, but the cluster I'm using is down > LLAP: Fast llap io with slow processing pipeline can lead to OOM > > > Key: HIVE-18269 > URL: https://issues.apache.org/jira/browse/HIVE-18269 > Project: Hive > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Prasanth Jayachandran >Assignee: Sergey Shelukhin > Attachments: HIVE-18269.01.patch, HIVE-18269.1.patch, > HIVE-18269.bad.patch, Screen Shot 2017-12-13 at 1.15.16 AM.png > > > The pendingData linked list in the Llap IO elevator (LlapRecordReader.java) may grow > indefinitely when Llap IO is faster than the processing pipeline. Since we don't > have backpressure to slow down the IO, this can lead to indefinite growth of > pending data, causing severe GC pressure and eventually an OOM. > This specific instance of LLAP was running on HDFS on top of an EBS volume > backed by SSD. The query that triggered this issue was ANALYZE STATISTICS > .. FOR COLUMNS, which also gathers bitvectors: a fast-IO, slow-processing case. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
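The unbounded pendingData list described in the issue is the crux; a hedged sketch of the usual remedy, a bounded queue whose blocking put() stalls the IO producer once the processing consumer falls behind (class and field names are illustrative, not the actual LlapRecordReader implementation):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class BackpressureSketch {
    // Bounded queue: put() blocks the producer (IO elevator) once the
    // consumer (processing pipeline) falls behind, capping memory use.
    private final BlockingQueue<Object> pendingData;

    public BackpressureSketch(int capacity) {
        pendingData = new ArrayBlockingQueue<>(capacity);
    }

    // IO side: blocks instead of growing the pending list without bound.
    public void putFromIo(Object batch) throws InterruptedException {
        pendingData.put(batch);
    }

    // Non-blocking variant: returns false when the queue is full, i.e. the
    // point at which an unbounded list would have kept growing.
    public boolean tryOffer(Object batch) {
        return pendingData.offer(batch);
    }

    // Processing side: frees a slot, letting the IO thread resume.
    public Object nextBatch() throws InterruptedException {
        return pendingData.take();
    }

    public int queued() {
        return pendingData.size();
    }
}
```

The capacity bound trades a little IO throughput for a hard ceiling on queued batches, which is exactly the GC-pressure scenario the report describes.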
[jira] [Commented] (HIVE-18096) add a user-friendly show plan command
[ https://issues.apache.org/jira/browse/HIVE-18096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312300#comment-16312300 ] Sergey Shelukhin commented on HIVE-18096: - Some minor comments. +1 otherwise, I can commit after the update. > add a user-friendly show plan command > - > > Key: HIVE-18096 > URL: https://issues.apache.org/jira/browse/HIVE-18096 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Harish Jaiprakash > Attachments: HIVE-18096.01.patch, HIVE-18096.02.patch > > > For admin to be able to get an overview of a resource plan. > We need to try to do this using sysdb. > If that is not possible to do in a nice way, we'd do a text-based one like > query explain, or desc extended table. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-18269) LLAP: Fast llap io with slow processing pipeline can lead to OOM
[ https://issues.apache.org/jira/browse/HIVE-18269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312290#comment-16312290 ] Jason Dere commented on HIVE-18269: --- Looks ok I think .. can you submit the patch so we can see precommit test results? -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-14615) Temp table leaves behind insert command
[ https://issues.apache.org/jira/browse/HIVE-14615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Sherman updated HIVE-14615: -- Attachment: HIVE-14615.3.patch > Temp table leaves behind insert command > --- > > Key: HIVE-14615 > URL: https://issues.apache.org/jira/browse/HIVE-14615 > Project: Hive > Issue Type: Bug > Components: Query Processor >Reporter: Chaoyu Tang >Assignee: Andrew Sherman > Attachments: HIVE-14615.1.patch, HIVE-14615.2.patch, > HIVE-14615.3.patch > > > {code} > create table test (key int, value string); > insert into test values (1, 'val1'); > show tables; > test > values__tmp__table__1 > {code} > the temp table values__tmp__table__1 resulted from INSERT INTO ... VALUES > and persists until the session ends. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
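The lingering values__tmp__table__1 above comes from Hive materializing INSERT ... VALUES into a temporary table; one possible fix direction, sketched here with assumed names, is to filter such tables out of SHOW TABLES results (the real patch may instead drop them eagerly):

```java
import java.util.ArrayList;
import java.util.List;

public class TempTableFilter {
    // Prefix taken from the table name shown in the report; assumed to be
    // the naming scheme for tables materialized from INSERT ... VALUES.
    static final String VALUES_TMP_PREFIX = "values__tmp__table__";

    // Hide the materialized VALUES tables from SHOW TABLES output.
    static List<String> visibleTables(List<String> allTables) {
        List<String> visible = new ArrayList<>();
        for (String name : allTables) {
            if (!name.startsWith(VALUES_TMP_PREFIX)) {
                visible.add(name);
            }
        }
        return visible;
    }
}
```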
[jira] [Commented] (HIVE-18349) Misc metastore changes for debuggability, error on commit txn failures
[ https://issues.apache.org/jira/browse/HIVE-18349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312285#comment-16312285 ] Hive QA commented on HIVE-18349: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12904687/HIVE-18349.4.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 610 failed/errored test(s), 11516 tests executed *Failed tests:* {noformat} TestMiniLlapLocalCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=165) [vector_interval_2.q,schema_evol_orc_acid_table_update.q,metadataonly1.q,auto_join_nulls.q,metadata_only_queries_with_filters.q,schema_evol_text_nonvec_part_all_complex.q,alter_merge_orc.q,vector_between_columns.q,vector_char_cast.q,vector_groupby_grouping_sets6.q,join_filters.q,udaf_collect_set_2.q,update_after_multiple_inserts.q,offset_limit_ppd_optimizer.q,materialized_view_describe.q,orc_merge_incompat1.q,vectorized_parquet_types.q,vector_windowing_gby2.q,explainanalyze_2.q,vectorization_15.q,union7.q,vectorization_nested_udf.q,vector_char_2.q,schema_evol_orc_acidvec_part.q,vector_groupby_3.q,materialized_view_create_rewrite_multi_db.q,acid_no_buckets.q,cbo_rp_gby.q,auto_sortmerge_join_9.q,vector_groupby_grouping_id2.q] org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[colstats_all_nulls] (batchId=245) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alterColumnStatsPart] (batchId=85) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alterColumnStats] (batchId=55) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_partition_update_status] (batchId=89) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_table_column_stats] (batchId=64) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_table_update_status] (batchId=79) 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_table_update_status_disable_bitvector] (batchId=78) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[analyze_tbl_part] (batchId=48) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_deep_filters] (batchId=89) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_filter] (batchId=8) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_groupby2] (batchId=47) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_groupby] (batchId=48) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_join] (batchId=53) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_join_pkfk] (batchId=14) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_limit] (batchId=11) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_part] (batchId=15) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_select] (batchId=61) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_table] (batchId=21) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_union] (batchId=49) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[array_size_estimation] (batchId=58) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_10] (batchId=73) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_1] (batchId=22) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_2] (batchId=83) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_5] (batchId=41) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_5a] (batchId=53) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_9] (batchId=36) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join12] (batchId=24) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join13] (batchId=80) 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join25] (batchId=72) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join_stats2] (batchId=86) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join_stats] (batchId=48) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join_without_localtask] (batchId=1) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[avro_decimal] (batchId=69) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[avro_decimal_native] (batchId=27) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bitvector] (batchId=82) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_SortUnionTransposeRule] (batchId=16) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_rp_annotate_stats_groupby] (batchId=84) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_rp_auto_join0] (batchId=15) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_rp_auto_join1] (batchId=4) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[c
[jira] [Commented] (HIVE-18004) investigate deriving app name from JDBC connection for pool mapping
[ https://issues.apache.org/jira/browse/HIVE-18004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312276#comment-16312276 ] Sergey Shelukhin commented on HIVE-18004: - We will go with (2) - url arguments > investigate deriving app name from JDBC connection for pool mapping > --- > > Key: HIVE-18004 > URL: https://issues.apache.org/jira/browse/HIVE-18004 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > > There are some client info fields that popular apps (Tableau, etc) might > populate; this might allow us to map queries to pools based on an application > used. Need to take a look (see the doc for an example API we might look into) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
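Option (2) above, deriving the application name from JDBC connection URL arguments for pool mapping, could be sketched as below. This is an illustration only: the parameter name `applicationName` and the helper are assumptions, not the actual Hive JDBC behavior.

```java
// Hypothetical sketch only: parse an application-name session variable out
// of a Hive JDBC URL so it can be mapped to a workload-management pool.
// The parameter name "applicationName" is an assumption for illustration.
public class AppNameFromUrl {
    static String appName(String jdbcUrl) {
        // Hive JDBC URLs carry session variables after ';' separators,
        // e.g. jdbc:hive2://host:10000/db;applicationName=tableau
        for (String part : jdbcUrl.split(";")) {
            String[] kv = part.split("=", 2);
            if (kv.length == 2 && kv[0].equalsIgnoreCase("applicationName")) {
                return kv[1];
            }
        }
        return null; // fall back to other mapping rules (user, group, ...)
    }

    public static void main(String[] args) {
        System.out.println(appName("jdbc:hive2://host:10000/db;applicationName=tableau"));
    }
}
```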
[jira] [Updated] (HIVE-14498) Freshness period for query rewriting using materialized views
[ https://issues.apache.org/jira/browse/HIVE-14498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-14498: --- Attachment: HIVE-14498.04.patch > Freshness period for query rewriting using materialized views > - > > Key: HIVE-14498 > URL: https://issues.apache.org/jira/browse/HIVE-14498 > Project: Hive > Issue Type: Sub-task > Components: Materialized views >Affects Versions: 2.2.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-14498.01.patch, HIVE-14498.02.patch, > HIVE-14498.03.patch, HIVE-14498.04.patch, HIVE-14498.patch > > > Once we have query rewriting in place (HIVE-14496), one of the main issues is > data freshness in the materialized views. > Since we will not support view maintenance at first, we could include a > HiveConf property to configure a max freshness period (_n timeunits_). If a > query comes, and the materialized view has been populated (by create, > refresh, etc.) for a longer period than _n_, then we should not use it for > rewriting the query. > Optionally, we could print a warning for the user indicating that the > materialized view was not used because it was not fresh. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
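The freshness rule described in the issue amounts to a simple time comparison. A minimal sketch with hypothetical names (the real check would live in the rewriting optimizer and read the period from a HiveConf property):

```java
// Hypothetical helper: a materialized view is usable for rewriting only if
// it was populated (by create, refresh, etc.) within the configured max period.
public class FreshnessCheck {
    static boolean usableForRewrite(long populatedAtMillis, long nowMillis,
                                    long maxStalenessMillis) {
        return nowMillis - populatedAtMillis <= maxStalenessMillis;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        long oneHour = 60L * 60L * 1000L;
        // A view refreshed 30 minutes ago, checked against a 1-hour period:
        System.out.println(usableForRewrite(now - oneHour / 2, now, oneHour));
    }
}
```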
[jira] [Commented] (HIVE-18367) Describe Extended output is truncated on a table with an explicit row format containing tabs or newlines.
[ https://issues.apache.org/jira/browse/HIVE-18367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312271#comment-16312271 ] Andrew Sherman commented on HIVE-18367: --- Test failures look unrelated to this change. [~pvary], could you please review? > Describe Extended output is truncated on a table with an explicit row format > containing tabs or newlines. > - > > Key: HIVE-18367 > URL: https://issues.apache.org/jira/browse/HIVE-18367 > Project: Hive > Issue Type: Bug >Reporter: Andrew Sherman >Assignee: Andrew Sherman > Attachments: HIVE-18367.1.patch > > > 'Describe Extended' dumps information about a table. The protocol for sending > this data relies on tabs and newlines to separate pieces of data. If a table > has 'FIELDS terminated by XXX' or 'LINES terminated by XXX' where XXX is a > tab or newline then the output seen by the user is prematurely truncated. Fix > this by replacing tabs and newlines in the table description with the literal > sequences "\t" and "\n". -- This message was sent by Atlassian JIRA (v6.4.14#64029)
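The fix described above amounts to escaping the protocol's delimiter characters before the description is emitted. A minimal sketch of the idea, with an illustrative helper name (not the actual patch):

```java
// Hypothetical helper: DESCRIBE EXTENDED output is tab/newline-delimited,
// so literal tabs/newlines inside a field (e.g. a row-format terminator)
// must be replaced with the two-character sequences \t and \n.
public class DescribeEscape {
    static String escapeDelimiters(String field) {
        if (field == null) {
            return null;
        }
        return field.replace("\t", "\\t").replace("\n", "\\n");
    }

    public static void main(String[] args) {
        // Without escaping, the real tab here would truncate the output.
        System.out.println(escapeDelimiters("FIELDS TERMINATED BY '\t'"));
    }
}
```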
[jira] [Commented] (HIVE-18214) Flaky test: TestSparkClient
[ https://issues.apache.org/jira/browse/HIVE-18214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312261#comment-16312261 ] Aihua Xu commented on HIVE-18214: - [~stakiar] Trying to understand the issue: when one test finishes, the rpc is closing and it will close the RemoteDriver to stop the SparkContext, but since that happens asynchronously, we don't know when it has really shut down? One thought: maybe we should add logic to always make sure there is only one JavaSparkContext instance created in one JVM. If one already exists and we try to create a new one, we can shut down the existing one first and then create the new one. > Flaky test: TestSparkClient > --- > > Key: HIVE-18214 > URL: https://issues.apache.org/jira/browse/HIVE-18214 > Project: Hive > Issue Type: Sub-task > Components: Spark >Reporter: Sahil Takiar >Assignee: Sahil Takiar > Attachments: HIVE-18214.1.patch > > > Looks like there is a race condition in {{TestSparkClient#runTest}}. The test > creates a {{RemoteDriver}} in memory, which creates a {{JavaSparkContext}}. A > new {{JavaSparkContext}} is created for each test that is run. There is a > race condition where the {{RemoteDriver}} isn't given enough time to > shut down, so when the next test starts running it creates another > {{JavaSparkContext}} which causes an exception like > {{org.apache.spark.SparkException: Only one SparkContext may be running in > this JVM (see SPARK-2243)}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
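The suggestion above, allowing at most one JavaSparkContext per JVM by shutting down any leftover instance first, could be sketched like this. The holder and the `Context` interface are test stand-ins, not actual Spark or Hive classes:

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Hypothetical sketch: keep at most one "context" per JVM; if a previous
// test left one behind, stop it before creating the next one, avoiding
// "Only one SparkContext may be running in this JVM" (SPARK-2243).
public class SingleContextHolder {
    // Stand-in for JavaSparkContext, which allows one live instance per JVM.
    interface Context extends AutoCloseable {
        @Override void close();
    }

    private static final AtomicReference<Context> CURRENT = new AtomicReference<>();

    static synchronized Context create(Supplier<Context> factory) {
        Context old = CURRENT.getAndSet(null);
        if (old != null) {
            old.close(); // shut down the leftover context first
        }
        Context fresh = factory.get();
        CURRENT.set(fresh);
        return fresh;
    }
}
```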
[jira] [Commented] (HIVE-18375) Cannot ORDER by subquery fields unless they are selected
[ https://issues.apache.org/jira/browse/HIVE-18375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312255#comment-16312255 ] Pengcheng Xiong commented on HIVE-18375: May be related to HIVE-15160. > Cannot ORDER by subquery fields unless they are selected > > > Key: HIVE-18375 > URL: https://issues.apache.org/jira/browse/HIVE-18375 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 2.3.2 > Environment: Amazon AWS > Release label:emr-5.11.0 > Hadoop distribution:Amazon 2.7.3 > Applications:Hive 2.3.2, Pig 0.17.0, Hue 4.0.1 > classification=hive-site,properties=[hive.strict.checks.cartesian.product=false,hive.mapred.mode=nonstrict] >Reporter: Paul Jackson >Priority: Minor > > Given these tables: > {code:SQL} > CREATE TABLE employees ( > emp_no INT, > first_name VARCHAR(14), > last_name VARCHAR(16) > ); > insert into employees values > (1, 'Gottlob', 'Frege'), > (2, 'Bertrand', 'Russell'), > (3, 'Ludwig', 'Wittgenstein'); > CREATE TABLE salaries ( > emp_no INT, > salary INT, > from_date DATE, > to_date DATE > ); > insert into salaries values > (1, 10, '1900-01-01', '1900-01-31'), > (1, 18, '1900-09-01', '1900-09-30'), > (2, 15, '1940-03-01', '1950-01-01'), > (3, 20, '1920-01-01', '1950-01-01'); > {code} > This query returns the names of the employees ordered by their peak salary: > {code:SQL} > SELECT `employees`.`last_name`, `employees`.`first_name`, `t1`.`max_salary` > FROM `default`.`employees` > INNER JOIN > (SELECT `emp_no`, MAX(`salary`) `max_salary` > FROM `default`.`salaries` > WHERE `emp_no` IS NOT NULL AND `salary` IS NOT NULL > GROUP BY `emp_no`) AS `t1` > ON `employees`.`emp_no` = `t1`.`emp_no` > ORDER BY `t1`.`max_salary` DESC; > {code} > However, this should still work even if the max_salary is not part of the > projection: > {code:SQL} > SELECT `employees`.`last_name`, `employees`.`first_name` > FROM `default`.`employees` > INNER JOIN > (SELECT `emp_no`, MAX(`salary`) `max_salary` > 
FROM `default`.`salaries` > WHERE `emp_no` IS NOT NULL AND `salary` IS NOT NULL > GROUP BY `emp_no`) AS `t1` > ON `employees`.`emp_no` = `t1`.`emp_no` > ORDER BY `t1`.`max_salary` DESC; > {code} > However, that fails with this error: > {code} > Error while compiling statement: FAILED: SemanticException [Error 10004]: > line 9:9 Invalid table alias or column reference 't1': (possible column names > are: last_name, first_name) > {code} > FWIW, this also fails: > {code:SQL} > SELECT `employees`.`last_name`, `employees`.`first_name`, `t1`.`max_salary` > AS `max_sal` > FROM `default`.`employees` > INNER JOIN > (SELECT `emp_no`, MAX(`salary`) `max_salary` > FROM `default`.`salaries` > WHERE `emp_no` IS NOT NULL AND `salary` IS NOT NULL > GROUP BY `emp_no`) AS `t1` > ON `employees`.`emp_no` = `t1`.`emp_no` > ORDER BY `t1`.`max_salary` DESC; > {code} > But this succeeds: > {code:SQL} > SELECT `employees`.`last_name`, `employees`.`first_name`, `t1`.`max_salary` > AS `max_sal` > FROM `default`.`employees` > INNER JOIN > (SELECT `emp_no`, MAX(`salary`) `max_salary` > FROM `default`.`salaries` > WHERE `emp_no` IS NOT NULL AND `salary` IS NOT NULL > GROUP BY `emp_no`) AS `t1` > ON `employees`.`emp_no` = `t1`.`emp_no` > ORDER BY `max_sal` DESC; > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-18275) add HS2-level WM metrics
[ https://issues.apache.org/jira/browse/HIVE-18275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-18275: Status: Patch Available (was: Open) > add HS2-level WM metrics > > > Key: HIVE-18275 > URL: https://issues.apache.org/jira/browse/HIVE-18275 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-18275.patch > > > E.g. time spent in pool queue. Some existing UIs use perflogger output, so we > should also include that. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-18275) add HS2-level WM metrics
[ https://issues.apache.org/jira/browse/HIVE-18275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-18275: Attachment: HIVE-18275.patch A small patch. Turns out there isn't much to HS2 per query metrics beside perflogger at the moment ;) [~thejas] can you take a look? > add HS2-level WM metrics > > > Key: HIVE-18275 > URL: https://issues.apache.org/jira/browse/HIVE-18275 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-18275.patch > > > E.g. time spent in pool queue. Some existing UIs use perflogger output, so we > should also include that. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-18349) Misc metastore changes for debuggability, error on commit txn failures
[ https://issues.apache.org/jira/browse/HIVE-18349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-18349: - Attachment: HIVE-18349.5.patch One minor fix: when we already throw a MetaException (for example, when dropping the default database), we will ignore the MetaException from the failed commit transaction. > Misc metastore changes for debuggability, error on commit txn failures > -- > > Key: HIVE-18349 > URL: https://issues.apache.org/jira/browse/HIVE-18349 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 3.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-18349.1.patch, HIVE-18349.2.patch, > HIVE-18349.3.patch, HIVE-18349.4.patch, HIVE-18349.5.patch > > > 1) Hive metastore audit event log/metastore log does not log the final status > (success or failed) of the event. Some operations like for example, > drop_table returns a boolean success flag but it never gets logged anywhere. > However the same is sent to end event listeners or other metastore event > listeners. It will be good to log the final status of the events. > 2) Make connection timeout when using connection pool configurable. Currently > its hard coded to 30 seconds. > 3) Provide a config to enable connection leak detection for HikariCP or > enable when debug logging is enabled. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
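Points 2 and 3 of the description map onto two standard HikariCP settings. A hedged configuration sketch, assuming HikariCP is on the classpath; how Hive surfaces these values as metastore config properties is not shown, and the class and parameter names here are illustrative:

```java
import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;

// Configuration sketch only, not the actual patch. The HikariCP setters
// are real API; the surrounding class and parameters are hypothetical.
public class MetastorePoolConfig {
    static HikariDataSource buildPool(String jdbcUrl,
                                      long connectionTimeoutMs,  // point 2: no longer hard-coded to 30s
                                      long leakDetectionMs) {    // point 3: 0 disables leak detection
        HikariConfig config = new HikariConfig();
        config.setJdbcUrl(jdbcUrl);
        config.setConnectionTimeout(connectionTimeoutMs);
        config.setLeakDetectionThreshold(leakDetectionMs);
        return new HikariDataSource(config);
    }
}
```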
[jira] [Updated] (HIVE-18349) Misc metastore changes for debuggability, error on commit txn failures
[ https://issues.apache.org/jira/browse/HIVE-18349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-18349: - Attachment: (was: HIVE-18349.5.patch) > Misc metastore changes for debuggability, error on commit txn failures > -- > > Key: HIVE-18349 > URL: https://issues.apache.org/jira/browse/HIVE-18349 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 3.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-18349.1.patch, HIVE-18349.2.patch, > HIVE-18349.3.patch, HIVE-18349.4.patch, HIVE-18349.5.patch > > > 1) Hive metastore audit event log/metastore log does not log the final status > (success or failed) of the event. Some operations like for example, > drop_table returns a boolean success flag but it never gets logged anywhere. > However the same is sent to end event listeners or other metastore event > listeners. It will be good to log the final status of the events. > 2) Make connection timeout when using connection pool configurable. Currently > its hard coded to 30 seconds. > 3) Provide a config to enable connection leak detection for HikariCP or > enable when debug logging is enabled. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-18349) Misc metastore changes for debuggability, error on commit txn failures
[ https://issues.apache.org/jira/browse/HIVE-18349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-18349: - Attachment: HIVE-18349.5.patch > Misc metastore changes for debuggability, error on commit txn failures > -- > > Key: HIVE-18349 > URL: https://issues.apache.org/jira/browse/HIVE-18349 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 3.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-18349.1.patch, HIVE-18349.2.patch, > HIVE-18349.3.patch, HIVE-18349.4.patch, HIVE-18349.5.patch > > > 1) Hive metastore audit event log/metastore log does not log the final status > (success or failed) of the event. Some operations like for example, > drop_table returns a boolean success flag but it never gets logged anywhere. > However the same is sent to end event listeners or other metastore event > listeners. It will be good to log the final status of the events. > 2) Make connection timeout when using connection pool configurable. Currently > its hard coded to 30 seconds. > 3) Provide a config to enable connection leak detection for HikariCP or > enable when debug logging is enabled. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-18349) Misc metastore changes for debuggability, error on commit txn failures
[ https://issues.apache.org/jira/browse/HIVE-18349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312227#comment-16312227 ] Hive QA commented on HIVE-18349: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 1s{color} | {color:blue} Findbugs executables are not available. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 30s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m 30s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 12s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 36s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 5s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 21s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 9s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 13s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle 
{color} | {color:red} 0m 31s{color} | {color:red} standalone-metastore: The patch generated 35 new + 957 unchanged - 19 fixed = 992 total (was 976) {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 6s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 12s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 15m 3s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh | | git revision | master / 3f5148d | | Default Java | 1.8.0_111 | | checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-8448/yetus/diff-checkstyle-standalone-metastore.txt | | whitespace | http://104.198.109.242/logs//PreCommit-HIVE-Build-8448/yetus/whitespace-eol.txt | | modules | C: standalone-metastore itests/hive-unit U: . | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-8448/yetus.txt | | Powered by | Apache Yetus http://yetus.apache.org | This message was automatically generated. 
> Misc metastore changes for debuggability, error on commit txn failures > -- > > Key: HIVE-18349 > URL: https://issues.apache.org/jira/browse/HIVE-18349 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 3.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-18349.1.patch, HIVE-18349.2.patch, > HIVE-18349.3.patch, HIVE-18349.4.patch > > > 1) Hive metastore audit event log/metastore log does not log the final status > (success or failed) of the event. Some operations like for example, > drop_table returns a boolean success flag but it never gets logged anywhere. > However the same is sent to end event listeners or other metastore event > listeners. It will be good to log the final status of the events. > 2) Make connection timeout when using connection pool configurable. Currently > its hard coded to 30 seconds. > 3) Provide a config to enable connection leak detection for HikariCP or > enable when debug logging is enabled. -- This message was sent b
[jira] [Commented] (HIVE-18361) Extend shared work optimizer to reuse computation beyond work boundaries
[ https://issues.apache.org/jira/browse/HIVE-18361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312203#comment-16312203 ] Hive QA commented on HIVE-18361: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12904647/HIVE-18361.02.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 23 failed/errored test(s), 11091 tests executed *Failed tests:* {noformat} TestMiniLlapLocalCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=166) [materialized_view_create.q,schema_evol_orc_acid_part_update.q,orc_ppd_varchar.q,optimize_join_ptp.q,count_dist_rewrite.q,vector_nvl.q,join_nullsafe.q,vectorized_mapjoin.q,cross_prod_1.q,vectorized_shufflejoin.q,autoColumnStats_10.q,tez_smb_1.q,limit_pushdown.q,tez_vector_dynpart_hashjoin_1.q,vector_inner_join.q,subquery_notin.q,vector_coalesce_2.q,table_access_keys_stats.q,subquery_null_agg.q,filter_join_breaktask.q,mapjoin_decimal.q,column_table_stats.q,alter_merge_2_orc.q,columnstats_part_coltype.q,explainanalyze_2.q,union4.q,stats_based_fetch_decision.q,auto_sortmerge_join_10.q,extrapolate_part_stats_partial_ndv.q,vector_decimal_udf2.q] TestMiniLlapLocalCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=169) [join_is_not_distinct_from.q,tez_nway_join.q,tez_schema_evolution.q,bucket_map_join_tez1.q,vector_multi_insert.q,insert_update_delete.q,temp_table.q,cte_1.q,autoColumnStats_2.q,partition_pruning.q,vectorization_17.q,orc_merge8.q,orc_merge_incompat2.q,bucket_groupby.q,vector_outer_join4.q,vector_nullsafe_join.q,orc_merge7.q,bucketpruning1.q,schema_evol_orc_acidvec_table.q,vector_grouping_sets.q,vector_outer_join5.q,vector_groupby6.q,bucketmapjoin1.q,auto_sortmerge_join_5.q,auto_join0.q,load_dyn_part1.q,vector_windowing.q,schema_evol_orc_nonvec_part_all_primitive.q,auto_sortmerge_join_11.q,orc_merge_incompat_writer_version.q] 
TestNegativeCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=93) [udf_invalid.q,authorization_uri_export.q,druid_datasource2.q,view_update.q,default_partition_name.q,authorization_public_create.q,load_wrong_fileformat_rc_seq.q,altern1.q,describe_xpath1.q,drop_view_failure2.q,orc_replace_columns2_acid.q,temp_table_rename.q,invalid_select_column_with_subquery.q,udf_trunc_error1.q,insert_view_failure.q,dbtxnmgr_nodbunlock.q,authorization_show_columns.q,cte_recursion.q,load_part_nospec.q,clusterbyorderby.q,orc_type_promotion2.q,ctas_noperm_loc.q,duplicate_alias_in_transform.q,invalid_create_tbl2.q,part_col_complex_type.q,authorization_drop_db_empty.q,smb_mapjoin_14.q,subquery_scalar_multi_rows.q,alter_partition_coltype_2columns.q,subquery_corr_in_agg.q,authorization_show_grant_otheruser_wtab.q,regex_col_groupby.q,udaf_collect_set_unsupported.q,ptf_negative_DuplicateWindowAlias.q,exim_22_export_authfail.q,udf_likeany_wrong1.q,groupby_key.q,ambiguous_col.q,groupby3_multi_distinct.q,authorization_alter_drop_ptn.q,invalid_cast_from_binary_5.q,show_create_table_does_not_exist.q,exim_20_managed_location_over_existing.q,interval_3.q,authorization_compile.q,join35.q,merge_negative_3.q,udf_concat_ws_wrong3.q,create_or_replace_view8.q,split_sample_out_of_range.q,alter_concatenate_indexed_table.q,authorization_show_grant_otherrole.q,create_with_constraints_duplicate_name.q,invalid_stddev_samp_syntax.q,authorization_view_disable_cbo_7.q,autolocal1.q,analyze_view.q,exim_14_nonpart_part.q,avro_non_nullable_union.q,load_orc_negative_part.q,drop_view_failure1.q,columnstats_partlvl_invalid_values_autogather.q,exim_13_nonnative_import.q,alter_table_wrong_regex.q,add_partition_with_whitelist.q,udf_next_day_error_2.q,authorization_select.q,udf_trunc_error2.q,authorization_view_7.q,udf_format_number_wrong5.q,touch2.q,exim_03_nonpart_noncompat_colschema.q,orc_type_promotion1.q,lateral_view_alias.q,show_tables_bad_db1.q,unset_table_property.q,alter_non_native.q,nvl_mismatch_type.q,load_orc_negative3.q,authorization_create_role_no_admin.q,invalid_distinct1.q,authorization_grant_server.q,orc_type_promotion3_acid.q,show_tables_bad1.q,macro_unused_parameter.q,drop_invalid_constraint3.q,char_pad_convert_fail3.q,exim_23_import_exist_authfail.q,drop_invalid_constraint4.q,archive1.q,subquery_multiple_cols_in_select.q,drop_index_failure.q,change_hive_hdfs_session_path.q,udf_trunc_error3.q,invalid_variance_syntax.q,authorization_truncate_2.q,invalid_avg_syntax.q,invalid_select_column_with_tablename.q,mm_truncate_cols.q,groupby_grouping_sets1.q,druid_location.q,groupby2_multi_distinct.q,authorization_sba_drop_table.q,dynamic_partitions_with_whitelist.q,delete_non_acid_table.q,udf_greatest_error_2.q,create_with_constraints_validate.q,authorization_view_6.q,show_tablestatus.q,describe_xpath3.q,duplicate_alias_in_transform_schema.q,create_with_fk_uk_same_tab.q,authorization_create_tbl.q,udtf_not
[jira] [Commented] (HIVE-18214) Flaky test: TestSparkClient
[ https://issues.apache.org/jira/browse/HIVE-18214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312200#comment-16312200 ] Sahil Takiar commented on HIVE-18214: - [~pvary] thanks for taking a look. * In production, {{RemoteDriver}} is run in a dedicated container; however, we have some unit tests which run it in the local process, so in production it's not really possible to hit this issue * I'm not a fan of exposing these methods publicly either; I can add a {{\@VisibleForTesting}} annotation; {{RemoteDriver}} is already marked as {{\@Private}} The only other way I can think of doing this is to change the {{TestSparkClient}} so it runs the {{RemoteDriver}} in a dedicated process (similar to what we do in production). The test will take longer to run, but we won't hit this issue. > Flaky test: TestSparkClient > --- > > Key: HIVE-18214 > URL: https://issues.apache.org/jira/browse/HIVE-18214 > Project: Hive > Issue Type: Sub-task > Components: Spark >Reporter: Sahil Takiar >Assignee: Sahil Takiar > Attachments: HIVE-18214.1.patch > > > Looks like there is a race condition in {{TestSparkClient#runTest}}. The test > creates a {{RemoteDriver}} in memory, which creates a {{JavaSparkContext}}. A > new {{JavaSparkContext}} is created for each test that is run. There is a > race condition where the {{RemoteDriver}} isn't given enough time to > shutdown, so when the next test starts running it creates another > {{JavaSparkContext}} which causes an exception like > {{org.apache.spark.SparkException: Only one SparkContext may be running in > this JVM (see SPARK-2243)}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16484) Investigate SparkLauncher for HoS as alternative to bin/spark-submit
[ https://issues.apache.org/jira/browse/HIVE-16484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312178#comment-16312178 ] Sahil Takiar commented on HIVE-16484: - Test failures are unrelated. I updated the RB and added a few notes to explain what the code is doing - https://reviews.apache.org/r/58684/ [~xuefuz], [~lirui] can you review? > Investigate SparkLauncher for HoS as alternative to bin/spark-submit > > > Key: HIVE-16484 > URL: https://issues.apache.org/jira/browse/HIVE-16484 > Project: Hive > Issue Type: Bug > Components: Spark >Reporter: Sahil Takiar >Assignee: Sahil Takiar > Attachments: HIVE-16484.1.patch, HIVE-16484.2.patch, > HIVE-16484.3.patch, HIVE-16484.4.patch, HIVE-16484.5.patch, > HIVE-16484.6.patch, HIVE-16484.7.patch, HIVE-16484.8.patch, HIVE-16484.9.patch > > > The {{SparkClientImpl#startDriver}} currently looks for the {{SPARK_HOME}} > directory and invokes the {{bin/spark-submit}} script, which spawns a > separate process to run the Spark application. > {{SparkLauncher}} was added in SPARK-4924 and is a programmatic way to launch > Spark applications. > I see a few advantages: > * No need to spawn a separate process to launch a HoS application --> lower startup time > * Simplifies the code in {{SparkClientImpl}} --> easier to debug > * {{SparkLauncher#startApplication}} returns a {{SparkAppHandle}} which > contains some useful utilities for querying the state of the Spark job > ** It also allows the launcher to specify a list of job listeners -- This message was sent by Atlassian JIRA (v6.4.14#64029)
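For reference, the SparkLauncher path discussed in this issue looks roughly like the following. This is a sketch, not the actual SparkClientImpl change; the master, app resource, and main-class values are placeholders:

```java
import org.apache.spark.launcher.SparkAppHandle;
import org.apache.spark.launcher.SparkLauncher;

// Sketch of a programmatic launch instead of forking bin/spark-submit.
// The concrete settings below are placeholders for illustration.
public class LauncherSketch {
    static SparkAppHandle launch() throws Exception {
        return new SparkLauncher()
            .setMaster("yarn")                          // placeholder master
            .setAppResource("/path/to/hive-exec.jar")   // placeholder resource
            .setMainClass("org.apache.hive.spark.client.RemoteDriver")
            .startApplication(new SparkAppHandle.Listener() {
                @Override public void stateChanged(SparkAppHandle handle) {
                    // handle.getState() can be polled here, e.g. to
                    // detect a FAILED launch early instead of parsing
                    // spark-submit's process output.
                }
                @Override public void infoChanged(SparkAppHandle handle) { }
            });
    }
}
```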
[jira] [Updated] (HIVE-18052) Run p-tests on mm tables
[ https://issues.apache.org/jira/browse/HIVE-18052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Yeom updated HIVE-18052: -- Attachment: HIVE-18052.16.patch patch 16 is the same as patch 15. Adding this since patch 15 is not run by the p-test system. > Run p-tests on mm tables > > > Key: HIVE-18052 > URL: https://issues.apache.org/jira/browse/HIVE-18052 > Project: Hive > Issue Type: Task >Reporter: Steve Yeom >Assignee: Steve Yeom > Attachments: HIVE-18052.1.patch, HIVE-18052.10.patch, > HIVE-18052.11.patch, HIVE-18052.12.patch, HIVE-18052.13.patch, > HIVE-18052.14.patch, HIVE-18052.15.patch, HIVE-18052.16.patch, > HIVE-18052.2.patch, HIVE-18052.3.patch, HIVE-18052.4.patch, > HIVE-18052.5.patch, HIVE-18052.6.patch, HIVE-18052.7.patch, > HIVE-18052.8.patch, HIVE-18052.9.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (HIVE-18052) Run p-tests on mm tables
[ https://issues.apache.org/jira/browse/HIVE-18052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312174#comment-16312174 ] Steve Yeom edited comment on HIVE-18052 at 1/4/18 11:22 PM: patch 16 is the same as patch 156. Adding this since patch 15 is not run by the p-test system. was (Author: steveyeom2017): patch 16 is the same as patch 156. Adding this since patch 15 is not tested. > Run p-tests on mm tables > > > Key: HIVE-18052 > URL: https://issues.apache.org/jira/browse/HIVE-18052 > Project: Hive > Issue Type: Task >Reporter: Steve Yeom >Assignee: Steve Yeom > Attachments: HIVE-18052.1.patch, HIVE-18052.10.patch, > HIVE-18052.11.patch, HIVE-18052.12.patch, HIVE-18052.13.patch, > HIVE-18052.14.patch, HIVE-18052.15.patch, HIVE-18052.16.patch, > HIVE-18052.2.patch, HIVE-18052.3.patch, HIVE-18052.4.patch, > HIVE-18052.5.patch, HIVE-18052.6.patch, HIVE-18052.7.patch, > HIVE-18052.8.patch, HIVE-18052.9.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (HIVE-18052) Run p-tests on mm tables
[ https://issues.apache.org/jira/browse/HIVE-18052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312174#comment-16312174 ] Steve Yeom edited comment on HIVE-18052 at 1/4/18 11:22 PM: patch 16 is the same as patch 15. Adding this since patch 15 is not run by the p-test system. was (Author: steveyeom2017): patch 16 is the same as patch 156. Adding this since patch 15 is not run by the p-test system. > Run p-tests on mm tables > > > Key: HIVE-18052 > URL: https://issues.apache.org/jira/browse/HIVE-18052 > Project: Hive > Issue Type: Task >Reporter: Steve Yeom >Assignee: Steve Yeom > Attachments: HIVE-18052.1.patch, HIVE-18052.10.patch, > HIVE-18052.11.patch, HIVE-18052.12.patch, HIVE-18052.13.patch, > HIVE-18052.14.patch, HIVE-18052.15.patch, HIVE-18052.16.patch, > HIVE-18052.2.patch, HIVE-18052.3.patch, HIVE-18052.4.patch, > HIVE-18052.5.patch, HIVE-18052.6.patch, HIVE-18052.7.patch, > HIVE-18052.8.patch, HIVE-18052.9.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-18349) Misc metastore changes for debuggability, error on commit txn failures
[ https://issues.apache.org/jira/browse/HIVE-18349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312173#comment-16312173 ] Thejas M Nair commented on HIVE-18349: -- +1 pending tests > Misc metastore changes for debuggability, error on commit txn failures > -- > > Key: HIVE-18349 > URL: https://issues.apache.org/jira/browse/HIVE-18349 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 3.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-18349.1.patch, HIVE-18349.2.patch, > HIVE-18349.3.patch, HIVE-18349.4.patch > > > 1) Hive metastore audit event log/metastore log does not log the final status > (success or failed) of the event. Some operations like for example, > drop_table returns a boolean success flag but it never gets logged anywhere. > However the same is sent to end event listeners or other metastore event > listeners. It will be good to log the final status of the events. > 2) Make connection timeout when using connection pool configurable. Currently > its hard coded to 30 seconds. > 3) Provide a config to enable connection leak detection for HikariCP or > enable when debug logging is enabled. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-18349) Misc metastore changes for debuggability, error on commit txn failures
[ https://issues.apache.org/jira/browse/HIVE-18349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-18349: - Summary: Misc metastore changes for debuggability, error on commit txn failures (was: Misc metastore changes for debuggability) > Misc metastore changes for debuggability, error on commit txn failures > -- > > Key: HIVE-18349 > URL: https://issues.apache.org/jira/browse/HIVE-18349 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 3.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-18349.1.patch, HIVE-18349.2.patch, > HIVE-18349.3.patch, HIVE-18349.4.patch > > > 1) The Hive metastore audit event log/metastore log does not log the final status > (success or failure) of an event. Some operations, for example > drop_table, return a boolean success flag, but it never gets logged anywhere, > even though the same flag is sent to end-event listeners and other metastore event > listeners. It would be good to log the final status of events. > 2) Make the connection timeout configurable when using a connection pool. Currently > it is hard-coded to 30 seconds. > 3) Provide a config to enable connection leak detection for HikariCP, or > enable it when debug logging is enabled. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-18379) ALTER TABLE authorization_part SET PROPERTIES ("PARTITIONL_LEVEL_PRIVILEGE"="TRUE"); fails when authorization_part is MicroManaged table.
[ https://issues.apache.org/jira/browse/HIVE-18379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Yeom updated HIVE-18379: -- Status: Patch Available (was: Open) > ALTER TABLE authorization_part SET PROPERTIES > ("PARTITIONL_LEVEL_PRIVILEGE"="TRUE"); fails when authorization_part is > MicroManaged table. > - > > Key: HIVE-18379 > URL: https://issues.apache.org/jira/browse/HIVE-18379 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 3.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Minor > Attachments: HIVE-18379.01.patch > > > ALTER TABLE authorization_part SET TBLPROPERTIES > ("PARTITION_LEVEL_PRIVILEGE"="TRUE") fails when authorization_part is a > Micromanaged table. > This is from authorization_2.q qtest. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-18379) ALTER TABLE authorization_part SET PROPERTIES ("PARTITIONL_LEVEL_PRIVILEGE"="TRUE"); fails when authorization_part is MicroManaged table.
[ https://issues.apache.org/jira/browse/HIVE-18379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Yeom updated HIVE-18379: -- Attachment: HIVE-18379.01.patch > ALTER TABLE authorization_part SET PROPERTIES > ("PARTITIONL_LEVEL_PRIVILEGE"="TRUE"); fails when authorization_part is > MicroManaged table. > - > > Key: HIVE-18379 > URL: https://issues.apache.org/jira/browse/HIVE-18379 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 3.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Minor > Attachments: HIVE-18379.01.patch > > > ALTER TABLE authorization_part SET TBLPROPERTIES > ("PARTITION_LEVEL_PRIVILEGE"="TRUE") fails when authorization_part is a > Micromanaged table. > This is from authorization_2.q qtest. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-18275) add HS2-level WM metrics
[ https://issues.apache.org/jira/browse/HIVE-18275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin reassigned HIVE-18275: --- Assignee: Sergey Shelukhin > add HS2-level WM metrics > > > Key: HIVE-18275 > URL: https://issues.apache.org/jira/browse/HIVE-18275 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > > E.g. time spent in pool queue. Some existing UIs use perflogger output, so we > should also include that. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-18269) LLAP: Fast llap io with slow processing pipeline can lead to OOM
[ https://issues.apache.org/jira/browse/HIVE-18269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312155#comment-16312155 ] Sergey Shelukhin commented on HIVE-18269: - [~prasanth_j] [~jdere] [~gopalv] can someone please review this patch? thnx > LLAP: Fast llap io with slow processing pipeline can lead to OOM > > > Key: HIVE-18269 > URL: https://issues.apache.org/jira/browse/HIVE-18269 > Project: Hive > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Prasanth Jayachandran >Assignee: Sergey Shelukhin > Attachments: HIVE-18269.01.patch, HIVE-18269.1.patch, > HIVE-18269.bad.patch, Screen Shot 2017-12-13 at 1.15.16 AM.png > > > The pendingData linked list in the LLAP IO elevator (LlapRecordReader.java) may grow > indefinitely when LLAP IO is faster than the processing pipeline. Since we don't > have backpressure to slow down the IO, this can lead to unbounded growth of > pending data, causing severe GC pressure and eventually OOM. > This specific instance of LLAP was running on HDFS on top of an EBS volume > backed by SSD. The query that triggered this issue was ANALYZE STATISTICS > .. FOR COLUMNS, which also gathers bitvectors. Fast IO and slow processing case. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
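The fix direction implied by the HIVE-18269 description — adding backpressure so that fast IO cannot grow the pending list without bound — can be sketched with a bounded queue. This is an illustrative Python sketch of the concept only, not the actual LLAP code; all names here are hypothetical:

```python
# Sketch: a bounded queue gives backpressure between a fast IO producer
# and a slow consumer. An unbounded list (like pendingData) lets the
# producer run ahead indefinitely; a bounded queue makes put() block
# once max_pending items are in flight, capping memory use.
import queue
import threading

def run_pipeline(n_items, max_pending=4):
    pending = queue.Queue(maxsize=max_pending)  # put() blocks when full
    consumed = []
    high_water = 0

    def producer():
        for i in range(n_items):
            pending.put(i)       # blocks instead of growing without limit
        pending.put(None)        # sentinel: no more data

    def consumer():
        nonlocal high_water
        while True:
            high_water = max(high_water, pending.qsize())
            item = pending.get()
            if item is None:
                return
            consumed.append(item)  # "slow processing" would happen here

    t = threading.Thread(target=producer)
    t.start()
    consumer()
    t.join()
    return consumed, high_water
```

With `max_pending=4`, the in-flight data never exceeds four items no matter how fast the producer is, which is exactly the property the unbounded pendingData list lacks.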
[jira] [Updated] (HIVE-18377) avoid explicitly setting HIVE_SUPPORT_CONCURRENCY in JUnit tests
[ https://issues.apache.org/jira/browse/HIVE-18377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-18377: -- Status: Patch Available (was: Open) > avoid explicitly setting HIVE_SUPPORT_CONCURRENCY in JUnit tests > > > Key: HIVE-18377 > URL: https://issues.apache.org/jira/browse/HIVE-18377 > Project: Hive > Issue Type: Sub-task > Components: Test, Transactions >Reporter: Eugene Koifman >Assignee: Eugene Koifman > Attachments: HIVE-18377.02.patch > > > many UTs (e.g. TestHCatMultiOutputFormat, > BeelineWithHS2ConnectionFileTestBase, TestOperationLoggingAPIWithMr, > HCatBaseTest and many others) > explicitly set > {{hiveConf.set(HiveConf.ConfVars.HIVE_SUPPORT_CONCURRENCY.varname, "false");}} > It would be better if they picked up the settings from > data/conf/hive-site.xml. > It adds consistency and makes it possible to run all tests with known config > (at least approach this). > The outline of the process is: > 1. build copies {{\*-site.xml files from data/conf/\*\*/\*-site.xml}} to > target/testconf/ > 2. HiveConf picks up target/testconf/hive-site.xml > 3. Various forms of *CliDriver may explicitly specify (e.g. > MiniLlapLocalCliConfig) which hive-site.xml to use > > The first step is to see how many explicit settings of > HIVE_SUPPORT_CONCURRENCY can be removed w/o breaking the tests. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-18377) avoid explicitly setting HIVE_SUPPORT_CONCURRENCY in JUnit tests
[ https://issues.apache.org/jira/browse/HIVE-18377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-18377: -- Attachment: HIVE-18377.02.patch > avoid explicitly setting HIVE_SUPPORT_CONCURRENCY in JUnit tests > > > Key: HIVE-18377 > URL: https://issues.apache.org/jira/browse/HIVE-18377 > Project: Hive > Issue Type: Sub-task > Components: Test, Transactions >Reporter: Eugene Koifman >Assignee: Eugene Koifman > Attachments: HIVE-18377.02.patch > > > many UTs (e.g. TestHCatMultiOutputFormat, > BeelineWithHS2ConnectionFileTestBase, TestOperationLoggingAPIWithMr, > HCatBaseTest and many others) > explicitly set > {{hiveConf.set(HiveConf.ConfVars.HIVE_SUPPORT_CONCURRENCY.varname, "false");}} > It would be better if they picked up the settings from > data/conf/hive-site.xml. > It adds consistency and makes it possible to run all tests with known config > (at least approach this). > The outline of the process is: > 1. build copies {{\*-site.xml files from data/conf/\*\*/\*-site.xml}} to > target/testconf/ > 2. HiveConf picks up target/testconf/hive-site.xml > 3. Various forms of *CliDriver may explicitly specify (e.g. > MiniLlapLocalCliConfig) which hive-site.xml to use > > The first step is to see how many explicit settings of > HIVE_SUPPORT_CONCURRENCY can be removed w/o breaking the tests. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-18349) Misc metastore changes for debuggability
[ https://issues.apache.org/jira/browse/HIVE-18349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312147#comment-16312147 ] Prasanth Jayachandran commented on HIVE-18349: -- [~thejas] Can you please take another look? Some more changes went into this patch. > Misc metastore changes for debuggability > > > Key: HIVE-18349 > URL: https://issues.apache.org/jira/browse/HIVE-18349 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 3.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-18349.1.patch, HIVE-18349.2.patch, > HIVE-18349.3.patch, HIVE-18349.4.patch > > > 1) The Hive metastore audit event log/metastore log does not log the final status > (success or failure) of an event. Some operations, for example > drop_table, return a boolean success flag, but it never gets logged anywhere, > even though the same flag is sent to end-event listeners and other metastore event > listeners. It would be good to log the final status of events. > 2) Make the connection timeout configurable when using a connection pool. Currently > it is hard-coded to 30 seconds. > 3) Provide a config to enable connection leak detection for HikariCP, or > enable it when debug logging is enabled. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-18349) Misc metastore changes for debuggability
[ https://issues.apache.org/jira/browse/HIVE-18349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-18349: - Attachment: HIVE-18349.4.patch The updated patch covers some more places where a failure to commit will throw an exception. > Misc metastore changes for debuggability > > > Key: HIVE-18349 > URL: https://issues.apache.org/jira/browse/HIVE-18349 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 3.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-18349.1.patch, HIVE-18349.2.patch, > HIVE-18349.3.patch, HIVE-18349.4.patch > > > 1) The Hive metastore audit event log/metastore log does not log the final status > (success or failure) of an event. Some operations, for example > drop_table, return a boolean success flag, but it never gets logged anywhere, > even though the same flag is sent to end-event listeners and other metastore event > listeners. It would be good to log the final status of events. > 2) Make the connection timeout configurable when using a connection pool. Currently > it is hard-coded to 30 seconds. > 3) Provide a config to enable connection leak detection for HikariCP, or > enable it when debug logging is enabled. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
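Item (1) of HIVE-18349 — always logging the final status (success or failure) of an event, including boolean results like drop_table's — can be sketched as a wrapper around each operation. This is an illustrative Python sketch of the idea, not the metastore implementation; the function and list names are hypothetical:

```python
# Sketch: wrap an operation so its final status is always recorded in an
# audit log, whether it returns a success flag, returns normally, or throws.
audit_log = []  # stand-in for the metastore audit event log

def audited(func):
    def wrapper(*args, **kwargs):
        try:
            result = func(*args, **kwargs)
        except Exception:
            audit_log.append((func.__name__, "failed"))
            raise
        # a boolean return value (as with drop_table) is part of the status
        status = "success" if result is not False else "failed"
        audit_log.append((func.__name__, status))
        return result
    return wrapper

@audited
def drop_table(name, exists=True):
    # hypothetical stand-in: returns a success flag, as drop_table does
    return exists
```

The point of the wrapper is that the flag that today reaches only the event listeners is also written to the log, so a failed drop_table is visible when debugging.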
[jira] [Assigned] (HIVE-18379) ALTER TABLE authorization_part SET PROPERTIES ("PARTITIONL_LEVEL_PRIVILEGE"="TRUE"); fails when authorization_part is MicroManaged table.
[ https://issues.apache.org/jira/browse/HIVE-18379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Yeom reassigned HIVE-18379: - Assignee: Steve Yeom > ALTER TABLE authorization_part SET PROPERTIES > ("PARTITIONL_LEVEL_PRIVILEGE"="TRUE"); fails when authorization_part is > MicroManaged table. > - > > Key: HIVE-18379 > URL: https://issues.apache.org/jira/browse/HIVE-18379 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 3.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Minor > > ALTER TABLE authorization_part SET TBLPROPERTIES > ("PARTITION_LEVEL_PRIVILEGE"="TRUE") fails when authorization_part is a > Micromanaged table. > This is from authorization_2.q qtest. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-18361) Extend shared work optimizer to reuse computation beyond work boundaries
[ https://issues.apache.org/jira/browse/HIVE-18361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312134#comment-16312134 ] Hive QA commented on HIVE-18361:

(x) *-1 overall*

|| Vote || Subsystem || Runtime || Comment ||
|| Prechecks || || || ||
| 0 | findbugs | 0m 0s | Findbugs executables are not available. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
|| master Compile Tests || || || ||
| 0 | mvndep | 1m 28s | Maven dependency ordering for branch |
| +1 | mvninstall | 5m 20s | master passed |
| +1 | compile | 1m 8s | master passed |
| +1 | checkstyle | 0m 49s | master passed |
| +1 | javadoc | 1m 2s | master passed |
|| Patch Compile Tests || || || ||
| 0 | mvndep | 0m 18s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 27s | the patch passed |
| +1 | compile | 1m 8s | the patch passed |
| +1 | javac | 1m 8s | the patch passed |
| -1 | checkstyle | 0m 19s | common: The patch generated 2 new + 932 unchanged - 0 fixed = 934 total (was 932) |
| -1 | checkstyle | 0m 34s | ql: The patch generated 19 new + 42 unchanged - 4 fixed = 61 total (was 46) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | javadoc | 1m 1s | the patch passed |
|| Other Tests || || || ||
| +1 | asflicense | 0m 12s | The patch does not generate ASF License warnings. |
| | | 15m 18s | |

|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / 3f5148d |
| Default Java | 1.8.0_111 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-8447/yetus/diff-checkstyle-common.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-8447/yetus/diff-checkstyle-ql.txt |
| modules | C: common ql itests U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-8447/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.
> Extend shared work optimizer to reuse computation beyond work boundaries > > > Key: HIVE-18361 > URL: https://issues.apache.org/jira/browse/HIVE-18361 > Project: Hive > Issue Type: New Feature > Components: Physical Optimizer >Affects Versions: 3.0.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Labels: TODOC3.0 > Attachments: HIVE-18361.01.patch, HIVE-18361.02.patch, > HIVE-18361.patch > > > Follow-up of the work in HIVE-16867. > HIVE-16867 introduced an optimization that identifies scans on input tables > that can be merged and reuses the computation that is done in the work > containing those scans. In particular, we traverse both parts of the plan > upstream and reuse the operators if possible. > Currently, the optimizer will not go beyond the output edge(s) of that work. > This extension removes that limitation. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-18221) test acid default
[ https://issues.apache.org/jira/browse/HIVE-18221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-18221: -- Attachment: HIVE-18221.23.patch > test acid default > - > > Key: HIVE-18221 > URL: https://issues.apache.org/jira/browse/HIVE-18221 > Project: Hive > Issue Type: Test > Components: Transactions >Affects Versions: 3.0.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman > Attachments: HIVE-18221.01.patch, HIVE-18221.02.patch, > HIVE-18221.03.patch, HIVE-18221.04.patch, HIVE-18221.07.patch, > HIVE-18221.08.patch, HIVE-18221.09.patch, HIVE-18221.10.patch, > HIVE-18221.11.patch, HIVE-18221.12.patch, HIVE-18221.13.patch, > HIVE-18221.14.patch, HIVE-18221.16.patch, HIVE-18221.18.patch, > HIVE-18221.19.patch, HIVE-18221.20.patch, HIVE-18221.21.patch, > HIVE-18221.22.patch, HIVE-18221.23.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-18376) Update committer-list
[ https://issues.apache.org/jira/browse/HIVE-18376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312128#comment-16312128 ] Chris Drome commented on HIVE-18376: Thanks. Committed. > Update committer-list > - > > Key: HIVE-18376 > URL: https://issues.apache.org/jira/browse/HIVE-18376 > Project: Hive > Issue Type: Bug >Reporter: Chris Drome >Assignee: Chris Drome >Priority: Trivial > Attachments: HIVE-18376.1.patch > > > Adding new entry to committer-list: > {noformat} > + > +cdrome > +Chris Drome > + href="https://www.oath.com/";>Oath > + > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (HIVE-18376) Update committer-list
[ https://issues.apache.org/jira/browse/HIVE-18376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Drome resolved HIVE-18376. Resolution: Fixed > Update committer-list > - > > Key: HIVE-18376 > URL: https://issues.apache.org/jira/browse/HIVE-18376 > Project: Hive > Issue Type: Bug >Reporter: Chris Drome >Assignee: Chris Drome >Priority: Trivial > Attachments: HIVE-18376.1.patch > > > Adding new entry to committer-list: > {noformat} > + > +cdrome > +Chris Drome > + href="https://www.oath.com/";>Oath > + > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-18368) Improve Spark Debug RDD Graph
[ https://issues.apache.org/jira/browse/HIVE-18368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sahil Takiar updated HIVE-18368: Status: Patch Available (was: Open) > Improve Spark Debug RDD Graph > - > > Key: HIVE-18368 > URL: https://issues.apache.org/jira/browse/HIVE-18368 > Project: Hive > Issue Type: Sub-task > Components: Spark >Reporter: Sahil Takiar >Assignee: Sahil Takiar > Attachments: HIVE-18368.1.patch, Spark UI - Named RDDs.png > > > The {{SparkPlan}} class does some logging to show the mapping between > different {{SparkTran}}, what shuffle types are used, and what trans are > cached. However, there is room for improvement. > When debug logging is enabled the RDD graph is logged, but there isn't much > information printed about each RDD. > We should combine both of the graphs and improve them. We could even make the > Spark Plan graph part of the {{EXPLAIN EXTENDED}} output. > Ideally, the final graph shows a clear relationship between Tran objects, > RDDs, and BaseWorks. Edge should include information about number of > partitions, shuffle types, Spark operations used, etc. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-18368) Improve Spark Debug RDD Graph
[ https://issues.apache.org/jira/browse/HIVE-18368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sahil Takiar updated HIVE-18368: Attachment: HIVE-18368.1.patch > Improve Spark Debug RDD Graph > - > > Key: HIVE-18368 > URL: https://issues.apache.org/jira/browse/HIVE-18368 > Project: Hive > Issue Type: Sub-task > Components: Spark >Reporter: Sahil Takiar >Assignee: Sahil Takiar > Attachments: HIVE-18368.1.patch, Spark UI - Named RDDs.png > > > The {{SparkPlan}} class does some logging to show the mapping between > different {{SparkTran}}, what shuffle types are used, and what trans are > cached. However, there is room for improvement. > When debug logging is enabled the RDD graph is logged, but there isn't much > information printed about each RDD. > We should combine both of the graphs and improve them. We could even make the > Spark Plan graph part of the {{EXPLAIN EXTENDED}} output. > Ideally, the final graph shows a clear relationship between Tran objects, > RDDs, and BaseWorks. Edge should include information about number of > partitions, shuffle types, Spark operations used, etc. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-18328) Improve schematool validator to report duplicate rows for column statistics
[ https://issues.apache.org/jira/browse/HIVE-18328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312111#comment-16312111 ] Hive QA commented on HIVE-18328: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12903348/HIVE-18328.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 19 failed/errored test(s), 11547 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join25] (batchId=72) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mapjoin_hook] (batchId=12) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=35) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucketsortoptimize_insert_2] (batchId=151) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[hybridgrace_hashjoin_2] (batchId=156) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=164) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] (batchId=168) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast] (batchId=159) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=159) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[bucketizedhiveinputformat] (batchId=177) org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[authorization_part] (batchId=93) org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[stats_aggregator_error_1] (batchId=93) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] (batchId=120) org.apache.hadoop.hive.metastore.TestEmbeddedHiveMetaStore.testTransactionalValidation (batchId=213) org.apache.hadoop.hive.ql.io.TestDruidRecordWriter.testWrite (batchId=253) 
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints (batchId=225) org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=231) org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=231) org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=231) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/8446/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/8446/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-8446/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 19 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12903348 - PreCommit-HIVE-Build > Improve schematool validator to report duplicate rows for column statistics > --- > > Key: HIVE-18328 > URL: https://issues.apache.org/jira/browse/HIVE-18328 > Project: Hive > Issue Type: Improvement > Components: Hive >Affects Versions: 2.1.1 >Reporter: Naveen Gangam >Assignee: Naveen Gangam > Attachments: HIVE-18328.patch > > > By design, in the {{TAB_COL_STATS}} table of the HMS schema, there should be > ONE AND ONLY ONE row, representing its statistics, for each column defined in > Hive. A combination of DB_NAME, TABLE_NAME and COLUMN_NAME constitutes a > primary key/unique row. > Each time the statistics are computed for a column, this row is updated. > However, if somehow, via the BDR/replication process, we end up with multiple > rows in this table for a given column, the HMS server cannot reliably recompute the statistics > thereafter. > So it would be good to detect this data anomaly via the schema validation > tool. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
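The validation HIVE-18328 proposes — reporting rows of TAB_COL_STATS where the (DB_NAME, TABLE_NAME, COLUMN_NAME) combination, which should be unique, appears more than once — boils down to a GROUP BY / HAVING query. Below is an illustrative sketch using an in-memory SQLite stand-in for the metastore schema (the real validator runs against the actual HMS backing database, and the real table has more columns):

```python
# Sketch: detect duplicate column-statistics rows, i.e. combinations of
# (DB_NAME, TABLE_NAME, COLUMN_NAME) that occur more than once.
import sqlite3

def find_duplicate_col_stats(conn):
    return conn.execute(
        """
        SELECT DB_NAME, TABLE_NAME, COLUMN_NAME, COUNT(*)
        FROM TAB_COL_STATS
        GROUP BY DB_NAME, TABLE_NAME, COLUMN_NAME
        HAVING COUNT(*) > 1
        """
    ).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TAB_COL_STATS (DB_NAME TEXT, TABLE_NAME TEXT, COLUMN_NAME TEXT)"
)
rows = [
    ("default", "t1", "c1"),
    ("default", "t1", "c1"),  # duplicate, e.g. introduced via replication
    ("default", "t1", "c2"),
]
conn.executemany("INSERT INTO TAB_COL_STATS VALUES (?, ?, ?)", rows)
dups = find_duplicate_col_stats(conn)
```

Each returned row names a column whose statistics exist more than once, which is exactly the anomaly the schematool validator should flag.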
[jira] [Updated] (HIVE-18368) Improve Spark Debug RDD Graph
[ https://issues.apache.org/jira/browse/HIVE-18368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sahil Takiar updated HIVE-18368: Attachment: Spark UI - Named RDDs.png > Improve Spark Debug RDD Graph > - > > Key: HIVE-18368 > URL: https://issues.apache.org/jira/browse/HIVE-18368 > Project: Hive > Issue Type: Sub-task > Components: Spark >Reporter: Sahil Takiar >Assignee: Sahil Takiar > Attachments: Spark UI - Named RDDs.png > > > The {{SparkPlan}} class does some logging to show the mapping between > different {{SparkTran}}, what shuffle types are used, and what trans are > cached. However, there is room for improvement. > When debug logging is enabled the RDD graph is logged, but there isn't much > information printed about each RDD. > We should combine both of the graphs and improve them. We could even make the > Spark Plan graph part of the {{EXPLAIN EXTENDED}} output. > Ideally, the final graph shows a clear relationship between Tran objects, > RDDs, and BaseWorks. Edge should include information about number of > partitions, shuffle types, Spark operations used, etc. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-18368) Improve Spark Debug RDD Graph
[ https://issues.apache.org/jira/browse/HIVE-18368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312109#comment-16312109 ] Sahil Takiar commented on HIVE-18368:
* Spark provides a nice RDD graph via {{RDD#toDebugString}} - I replaced the {{SparkPlan#logSparkPlan}} and {{SparkUtilities#rddGraphToString}} with this graph. It includes all the info from both of these graphs + more info. It's very similar to the info that is shown in the Spark Web UI. An example is below.
* Added explicit names for each RDD; the name is derived from the name of the {{BaseWork}} that corresponds to the RDD, along with the {{SparkEdgeProperty}} (if there is one). The example below shows this in detail.
** The nice thing about adding explicit names is that they show up in the Spark Web UI too, which can be very useful for mapping a Hive Explain Plan to the Spark RDD DAG
** The name includes the number of partitions for the RDD as well as whether or not the RDD is cached
* I originally wanted to find a way to display this in the {{EXPLAIN EXTENDED}} output, but for now that may be a bit difficult, because the {{SparkPlan}} is only generated in the {{RemoteDriver}} - it's probably possible to generate the {{SparkPlan}} somewhere in the {{ExplainTask}}, but I'll save that for a later JIRA
* The Spark RDD Graph is printed at INFO level, which I think should help with debugging
* I've attached a screenshot of what the Spark Web UI looks like with named RDDs

Spark RDD Graph:
{code}
(1) Reducer 5 (1) MapPartitionsRDD[25] at mapPartitionsToPair at ReduceTran.java:41 []
 |  Reducer 5 (SORT, 1) ShuffledRDD[24] at sortByKey at SortByShuffler.java:51 []
 +-(166) Reducer 4 (166) MapPartitionsRDD[23] at mapPartitionsToPair at ReduceTran.java:41 []
    |  Reducer 4 (PARTITION-LEVEL SORT, 166) ShuffledRDD[22] at repartitionAndSortWithinPartitions at SortByShuffler.java:57 []
    +-(328) UnionRDD (328) UnionRDD[21] at union at SparkPlan.java:70 []
       |  Reducer 3 (328) MapPartitionsRDD[19] at mapPartitionsToPair at ReduceTran.java:41 []
       |  Reducer 3 (PARTITION-LEVEL SORT, 328) ShuffledRDD[18] at repartitionAndSortWithinPartitions at SortByShuffler.java:57 []
       +-(874) UnionRDD (874) UnionRDD[17] at union at SparkPlan.java:70 []
          |  UnionRDD (874) UnionRDD[16] at union at SparkPlan.java:70 []
          |  Reducer 2 (437) MapPartitionsRDD[11] at mapPartitionsToPair at ReduceTran.java:41 []
          |  Reducer 2 (GROUP, 437) MapPartitionsRDD[10] at groupByKey at GroupByShuffler.java:31 []
          |  ShuffledRDD[9] at groupByKey at GroupByShuffler.java:31 []
          +-(0) Map 1 (0) MapPartitionsRDD[8] at mapPartitionsToPair at MapTran.java:41 []
             |  Map 1 (store_sales, 0) HadoopRDD[4] at hadoopRDD at SparkPlanGenerator.java:203 []
          |  Reducer 8 (437) MapPartitionsRDD[14] at mapPartitionsToPair at ReduceTran.java:41 []
          |  Reducer 8 (GROUP PARTITION-LEVEL SORT, 437) ShuffledRDD[13] at repartitionAndSortWithinPartitions at SortByShuffler.java:57 []
          +-(0) Map 7 (0) MapPartitionsRDD[12] at mapPartitionsToPair at MapTran.java:41 []
             |  Map 7 (store_sales, 0) HadoopRDD[5] at hadoopRDD at SparkPlanGenerator.java:203 []
          |  Map 10 (0) MapPartitionsRDD[15] at mapPartitionsToPair at MapTran.java:41 []
          |  Map 10 (store, 0) HadoopRDD[6] at hadoopRDD at SparkPlanGenerator.java:203 []
       |  Map 11 (0) MapPartitionsRDD[20] at mapPartitionsToPair at MapTran.java:41 []
       |  Map 11 (item, 0) HadoopRDD[7] at hadoopRDD at SparkPlanGenerator.java:203 []
{code}
> Improve Spark Debug RDD Graph > - > > Key: HIVE-18368 > URL: https://issues.apache.org/jira/browse/HIVE-18368 > Project: Hive > Issue Type: Sub-task > Components: Spark >Reporter: Sahil Takiar >Assignee: Sahil Takiar > Attachments: Spark UI - Named RDDs.png > > > The {{SparkPlan}} class does some logging to show the mapping between > different {{SparkTran}}, what shuffle types are used, and what trans are > cached. However, there is room for improvement. 
> When debug logging is enabled the RDD graph is logged, but there isn't much > information printed about each RDD. > We should combine both of the graphs and improve them. We could even make the > Spark Plan graph part of the {{EXPLAIN EXTENDED}} output. > Ideally, the final graph shows a clear relationship between Tran objects, > RDDs, and BaseWorks. Edge should include information about number of > partitions, shuffle types, Spark operations used, etc. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
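The naming scheme described in the comment above — an RDD name built from the corresponding {{BaseWork}} name plus the edge/shuffle type and partition count, yielding names like "Reducer 4 (PARTITION-LEVEL SORT, 166)" — can be sketched as follows. This is an illustrative Python sketch of the format only; the actual patch sets these names on Spark RDDs in Java, and the helper name here is hypothetical:

```python
# Sketch: derive an RDD display name from the BaseWork name, the
# shuffle/edge type (if any), and the partition count, matching the
# "Name (EDGE TYPE, partitions)" format seen in the RDD graph above.
def rdd_display_name(work_name, edge_type=None, num_partitions=None):
    details = [d for d in (edge_type,
                           str(num_partitions) if num_partitions is not None else None)
               if d is not None]
    return f"{work_name} ({', '.join(details)})" if details else work_name
```

Because Spark shows RDD names in its Web UI, names in this format let a reader map each node of the Spark DAG back to the Hive explain plan's Map/Reducer works.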
[jira] [Commented] (HIVE-18376) Update committer-list
[ https://issues.apache.org/jira/browse/HIVE-18376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312106#comment-16312106 ] Mithun Radhakrishnan commented on HIVE-18376: - +1. Welcome to the fold! :) > Update committer-list > - > > Key: HIVE-18376 > URL: https://issues.apache.org/jira/browse/HIVE-18376 > Project: Hive > Issue Type: Bug >Reporter: Chris Drome >Assignee: Chris Drome >Priority: Trivial > Attachments: HIVE-18376.1.patch > > > Adding new entry to committer-list: > {noformat} > + > +cdrome > +Chris Drome > + href="https://www.oath.com/";>Oath > + > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-18368) Improve Spark Debug RDD Graph
[ https://issues.apache.org/jira/browse/HIVE-18368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sahil Takiar updated HIVE-18368: Summary: Improve Spark Debug RDD Graph (was: Improve SparkPlan Graph) > Improve Spark Debug RDD Graph > - > > Key: HIVE-18368 > URL: https://issues.apache.org/jira/browse/HIVE-18368 > Project: Hive > Issue Type: Sub-task > Components: Spark >Reporter: Sahil Takiar >Assignee: Sahil Takiar > > The {{SparkPlan}} class does some logging to show the mapping between > different {{SparkTran}}, what shuffle types are used, and what trans are > cached. However, there is room for improvement. > When debug logging is enabled the RDD graph is logged, but there isn't much > information printed about each RDD. > We should combine both of the graphs and improve them. We could even make the > Spark Plan graph part of the {{EXPLAIN EXTENDED}} output. > Ideally, the final graph shows a clear relationship between Tran objects, > RDDs, and BaseWorks. Edge should include information about number of > partitions, shuffle types, Spark operations used, etc. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-18378) Explain plan should show if a Map/Reduce Work is being cached
[ https://issues.apache.org/jira/browse/HIVE-18378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312093#comment-16312093 ] Sahil Takiar commented on HIVE-18378: - The {{CombineEquivalentWorkResolver}} should also print something in the log every time it decides to combine two work objects. Right now it doesn't print anything at the INFO level. > Explain plan should show if a Map/Reduce Work is being cached > - > > Key: HIVE-18378 > URL: https://issues.apache.org/jira/browse/HIVE-18378 > Project: Hive > Issue Type: Sub-task > Components: Spark >Reporter: Sahil Takiar > > It would be nice if the explain plan showed what {{MapWork}} / {{ReduceWork}} > objects are being cached by Spark. > The {{CombineEquivalentWorkResolver}} is the only code that triggers Spark > caching, so we should be able to modify it so that it displays if a work > object will be cached or not. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-16601) Display Session Id and Query Name / Id in Spark UI
[ https://issues.apache.org/jira/browse/HIVE-16601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sahil Takiar updated HIVE-16601: Issue Type: Sub-task (was: Bug) Parent: HIVE-17718 > Display Session Id and Query Name / Id in Spark UI > -- > > Key: HIVE-16601 > URL: https://issues.apache.org/jira/browse/HIVE-16601 > Project: Hive > Issue Type: Sub-task > Components: Spark >Reporter: Sahil Takiar >Assignee: Sahil Takiar > Labels: TODOC3.0 > Fix For: 3.0.0 > > Attachments: HIVE-16601.1.patch, HIVE-16601.2.patch, > HIVE-16601.3.patch, HIVE-16601.4.patch, HIVE-16601.5.patch, > HIVE-16601.6.patch, HIVE-16601.7.patch, HIVE-16601.8.patch, Spark UI > Applications List.png, Spark UI Jobs List.png > > > We should display the session id for each HoS Application Launched, and the > Query Name / Id and Dag Id for each Spark job launched. Hive-on-MR does > something similar via the {{mapred.job.name}} parameter. The query name is > displayed in the Job Name of the MR app. > The changes here should also allow us to leverage the config > {{hive.query.name}} for HoS. > This should help with debuggability of HoS applications. The Hive-on-Tez UI > does something similar. > Related issues for Hive-on-Tez: HIVE-12357, HIVE-12523 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-18377) avoid explicitly setting HIVE_SUPPORT_CONCURRENCY in JUnit tests
[ https://issues.apache.org/jira/browse/HIVE-18377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman reassigned HIVE-18377: - > avoid explicitly setting HIVE_SUPPORT_CONCURRENCY in JUnit tests > > > Key: HIVE-18377 > URL: https://issues.apache.org/jira/browse/HIVE-18377 > Project: Hive > Issue Type: Sub-task > Components: Test, Transactions >Reporter: Eugene Koifman >Assignee: Eugene Koifman > > Many UTs (e.g. TestHCatMultiOutputFormat, > BeelineWithHS2ConnectionFileTestBase, TestOperationLoggingAPIWithMr, > HCatBaseTest and many others) > explicitly set > {{hiveConf.set(HiveConf.ConfVars.HIVE_SUPPORT_CONCURRENCY.varname, "false");}} > It would be better if they picked up the settings from > data/conf/hive-site.xml. > This adds consistency and makes it possible to run all tests with a known config > (or at least approach that). > The outline of the process is: > 1. build copies {{\*-site.xml files from data/conf/\*\*/\*-site.xml}} to > target/testconf/ > 2. HiveConf picks up target/testconf/hive-site.xml > 3. Various forms of *CliDriver may explicitly specify (e.g. > MiniLlapLocalCliConfig) which hive-site.xml to use > > The first step is to see how many explicit settings of > HIVE_SUPPORT_CONCURRENCY can be removed without breaking the tests. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
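The mechanism HIVE-18377 leans on, picking settings up from a shared Hadoop-style `*-site.xml` instead of hard-coding them per test, can be sketched with a small parser. The file contents and helper name below are illustrative assumptions for the sketch; in Hive itself this wiring happens inside {{HiveConf}}:

```python
import xml.etree.ElementTree as ET
from io import StringIO

# A Hadoop-style configuration file, in the shape of data/conf/hive-site.xml.
HIVE_SITE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>hive.support.concurrency</name>
    <value>false</value>
  </property>
</configuration>
"""

def load_site_xml(source):
    """Parse a *-site.xml file (path or file-like object) into a {name: value} dict."""
    conf = {}
    for prop in ET.parse(source).getroot().iter("property"):
        conf[prop.findtext("name")] = prop.findtext("value")
    return conf

# Tests read the shared config instead of calling hiveConf.set(...) themselves.
conf = load_site_xml(StringIO(HIVE_SITE))
print(conf["hive.support.concurrency"])  # false
```

The payoff is the one described in the ticket: every test sees the same known configuration, and divergence is confined to which hive-site.xml a CliDriver variant points at.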
[jira] [Updated] (HIVE-18074) do not show rejected tasks as killed in query UI
[ https://issues.apache.org/jira/browse/HIVE-18074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-18074: Issue Type: Task (was: Sub-task) Parent: (was: HIVE-17481) > do not show rejected tasks as killed in query UI > > > Key: HIVE-18074 > URL: https://issues.apache.org/jira/browse/HIVE-18074 > Project: Hive > Issue Type: Task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > > Tasks rejected from LLAP because the cluster is full are shown as killed > tasks in the commandline query UI (CLI and beeline). This shouldn't really > happen; killed tasks in the container case means something else, and this > scenario doesn't exist because AM doesn't continuously try to queue tasks. We > could change LLAP queue to use sort of a pull model (would also allow for > better duplicate scheduling), but for now we should fix the UI -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-18238) Driver execution may not have configuration changing sideeffects
[ https://issues.apache.org/jira/browse/HIVE-18238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich updated HIVE-18238: Attachment: HIVE-18238.04wip01.patch I can't seem to reproduce these test failures... If I run {{TestNegativeCliDriver}} on the current master, it also starts failing with OOM errors. However, I've ensured that {{Driver.destroy}} is called - previously it was called after every failed command... > Driver execution may not have configuration changing sideeffects > - > > Key: HIVE-18238 > URL: https://issues.apache.org/jira/browse/HIVE-18238 > Project: Hive > Issue Type: Sub-task > Components: Logical Optimizer >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich > Attachments: HIVE-18238.01wip01.patch, HIVE-18238.02.patch, > HIVE-18238.03.patch, HIVE-18238.04wip01.patch > > > {{Driver}} executes SQL statements which use "hiveconf" settings; > but the {{Driver}} itself may *not* change the configuration... > I've found an example which shows how hazardous this is: > {code} > set hive.mapred.mode=strict; > select "${hiveconf:hive.mapred.mode}"; > create table t (a int); > analyze table t compute statistics; > select "${hiveconf:hive.mapred.mode}"; > {code} > currently, the last select returns {{nonstrict}} because of > [this|https://github.com/apache/hive/blob/7ddd915bf82a68c8ab73b0c4ca409f1a6d43d227/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java#L1696] -- This message was sent by Atlassian JIRA (v6.4.14#64029)
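The hazard HIVE-18238 describes is statement execution mutating the session-level configuration as a side effect. A minimal sketch of the pattern and its fix, with entirely hypothetical names (`run_statement` is not Hive's API; it only mimics the SemanticAnalyzer behavior linked in the ticket):

```python
# Illustrative sketch: a "driver" that mutates config while executing,
# and a caller that shields the session config from the side effect.

def run_statement(conf, stmt):
    """Mimics the analyzer flipping hive.mapred.mode during 'analyze'."""
    if stmt.startswith("analyze"):
        conf["hive.mapred.mode"] = "nonstrict"  # the unwanted side effect
    return conf["hive.mapred.mode"]

# Buggy behavior: the side effect leaks into the session config.
session_conf = {"hive.mapred.mode": "strict"}
run_statement(session_conf, "analyze table t compute statistics")
print(session_conf["hive.mapred.mode"])  # nonstrict - leaked

# Fixed behavior: each statement runs against its own copy of the config.
session_conf = {"hive.mapred.mode": "strict"}
run_statement(dict(session_conf), "analyze table t compute statistics")
print(session_conf["hive.mapred.mode"])  # strict - preserved
```

The copy-per-statement approach is one way to get the property the ticket asks for: later statements observe the session's settings, not whatever the previous statement's analysis happened to scribble into the config.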
[jira] [Commented] (HIVE-18375) Cannot ORDER by subquery fields unless they are selected
[ https://issues.apache.org/jira/browse/HIVE-18375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312051#comment-16312051 ] Paul Jackson commented on HIVE-18375: - This seems related. Posting as a comment, but perhaps it should be in its own bug report. Order By cannot see fields if they are projected with an alias. The first two queries fail with: {code}SemanticException [Error 10004]: line 7:9 Invalid table alias or column reference 'emp_no': (possible column names are: f_4, f_3, f_5){code} The last two succeed. {code:SQL} SELECT `first_name` `F_4`, `last_name` `F_5` FROM `default`.`employees` ORDER BY `emp_no` DESC; SELECT `first_name` `F_4`, `emp_no` `F_3`, `last_name` `F_5` FROM `default`.`employees` ORDER BY `emp_no` DESC; SELECT `first_name` `F_4`, `emp_no`, `last_name` `F_5` FROM `default`.`employees` ORDER BY `emp_no` DESC; SELECT `first_name` `F_4`, `emp_no` `F_3`, `last_name` `F_5` FROM `default`.`employees` ORDER BY `F_3` DESC; {code} > Cannot ORDER by subquery fields unless they are selected > > > Key: HIVE-18375 > URL: https://issues.apache.org/jira/browse/HIVE-18375 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 2.3.2 > Environment: Amazon AWS > Release label:emr-5.11.0 > Hadoop distribution:Amazon 2.7.3 > Applications:Hive 2.3.2, Pig 0.17.0, Hue 4.0.1 > classification=hive-site,properties=[hive.strict.checks.cartesian.product=false,hive.mapred.mode=nonstrict] >Reporter: Paul Jackson >Priority: Minor > > Give these tables: > {code:SQL} > CREATE TABLE employees ( > emp_no INT, > first_name VARCHAR(14), > last_name VARCHAR(16) > ); > insert into employees values > (1, 'Gottlob', 'Frege'), > (2, 'Bertrand', 'Russell'), > (3, 'Ludwig', 'Wittgenstein'); > CREATE TABLE salaries ( > emp_no INT, > salary INT, > from_date DATE, > to_date DATE > ); > insert into salaries values > (1, 10, '1900-01-01', '1900-01-31'), > (1, 18, '1900-09-01', '1900-09-30'), > (2, 15, '1940-03-01', 
'1950-01-01'), > (3, 20, '1920-01-01', '1950-01-01'); > {code} > This query returns the names of the employees ordered by their peak salary: > {code:SQL} > SELECT `employees`.`last_name`, `employees`.`first_name`, `t1`.`max_salary` > FROM `default`.`employees` > INNER JOIN > (SELECT `emp_no`, MAX(`salary`) `max_salary` > FROM `default`.`salaries` > WHERE `emp_no` IS NOT NULL AND `salary` IS NOT NULL > GROUP BY `emp_no`) AS `t1` > ON `employees`.`emp_no` = `t1`.`emp_no` > ORDER BY `t1`.`max_salary` DESC; > {code} > However, this should still work even if the max_salary is not part of the > projection: > {code:SQL} > SELECT `employees`.`last_name`, `employees`.`first_name` > FROM `default`.`employees` > INNER JOIN > (SELECT `emp_no`, MAX(`salary`) `max_salary` > FROM `default`.`salaries` > WHERE `emp_no` IS NOT NULL AND `salary` IS NOT NULL > GROUP BY `emp_no`) AS `t1` > ON `employees`.`emp_no` = `t1`.`emp_no` > ORDER BY `t1`.`max_salary` DESC; > {code} > However, that fails with this error: > {code} > Error while compiling statement: FAILED: SemanticException [Error 10004]: > line 9:9 Invalid table alias or column reference 't1': (possible column names > are: last_name, first_name) > {code} > FWIW, this also fails: > {code:SQL} > SELECT `employees`.`last_name`, `employees`.`first_name`, `t1`.`max_salary` > AS `max_sal` > FROM `default`.`employees` > INNER JOIN > (SELECT `emp_no`, MAX(`salary`) `max_salary` > FROM `default`.`salaries` > WHERE `emp_no` IS NOT NULL AND `salary` IS NOT NULL > GROUP BY `emp_no`) AS `t1` > ON `employees`.`emp_no` = `t1`.`emp_no` > ORDER BY `t1`.`max_salary` DESC; > {code} > But this succeeds: > {code:SQL} > SELECT `employees`.`last_name`, `employees`.`first_name`, `t1`.`max_salary` > AS `max_sal` > FROM `default`.`employees` > INNER JOIN > (SELECT `emp_no`, MAX(`salary`) `max_salary` > FROM `default`.`salaries` > WHERE `emp_no` IS NOT NULL AND `salary` IS NOT NULL > GROUP BY `emp_no`) AS `t1` > ON `employees`.`emp_no` = `t1`.`emp_no` > 
ORDER BY `max_sal` DESC; > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
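For reference, ordering by a join-input column that is not in the SELECT list is well-formed standard SQL. A small reproduction of the expected semantics with the ticket's own data, run against SQLite purely as a stand-in engine (this demonstrates the intended result, not Hive's behavior):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (emp_no INT, first_name TEXT, last_name TEXT);
INSERT INTO employees VALUES
  (1, 'Gottlob', 'Frege'), (2, 'Bertrand', 'Russell'), (3, 'Ludwig', 'Wittgenstein');
CREATE TABLE salaries (emp_no INT, salary INT, from_date DATE, to_date DATE);
INSERT INTO salaries VALUES
  (1, 10, '1900-01-01', '1900-01-31'), (1, 18, '1900-09-01', '1900-09-30'),
  (2, 15, '1940-03-01', '1950-01-01'), (3, 20, '1920-01-01', '1950-01-01');
""")

# ORDER BY t1.max_salary even though it is not projected:
# peak salaries are Wittgenstein 20, Frege 18, Russell 15.
names = [r[0] for r in conn.execute("""
    SELECT employees.last_name
    FROM employees
    INNER JOIN (SELECT emp_no, MAX(salary) AS max_salary
                FROM salaries
                WHERE emp_no IS NOT NULL AND salary IS NOT NULL
                GROUP BY emp_no) AS t1
    ON employees.emp_no = t1.emp_no
    ORDER BY t1.max_salary DESC
""")]
print(names)  # ['Wittgenstein', 'Frege', 'Russell']
```

The same query fails in the reported Hive versions with the SemanticException quoted above, which is the gap the ticket is about.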
[jira] [Updated] (HIVE-18376) Update committer-list
[ https://issues.apache.org/jira/browse/HIVE-18376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Drome updated HIVE-18376: --- Attachment: HIVE-18376.1.patch > Update committer-list > - > > Key: HIVE-18376 > URL: https://issues.apache.org/jira/browse/HIVE-18376 > Project: Hive > Issue Type: Bug >Reporter: Chris Drome >Assignee: Chris Drome >Priority: Trivial > Attachments: HIVE-18376.1.patch > > > Adding new entry to committer-list: > {noformat} > + > +cdrome > +Chris Drome > + href="https://www.oath.com/";>Oath > + > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-18328) Improve schematool validator to report duplicate rows for column statistics
[ https://issues.apache.org/jira/browse/HIVE-18328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312040#comment-16312040 ] Hive QA commented on HIVE-18328: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 1s{color} | {color:blue} Findbugs executables are not available. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 25s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m 14s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 52s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 22s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 31s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 20s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 52s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 52s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle 
{color} | {color:red} 0m 11s{color} | {color:red} beeline: The patch generated 6 new + 88 unchanged - 0 fixed = 94 total (was 88) {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 31s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 13s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 12m 1s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh | | git revision | master / 3f5148d | | Default Java | 1.8.0_111 | | checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-8446/yetus/diff-checkstyle-beeline.txt | | modules | C: beeline itests/hive-unit U: . | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-8446/yetus.txt | | Powered by | Apache Yetushttp://yetus.apache.org | This message was automatically generated. > Improve schematool validator to report duplicate rows for column statistics > --- > > Key: HIVE-18328 > URL: https://issues.apache.org/jira/browse/HIVE-18328 > Project: Hive > Issue Type: Improvement > Components: Hive >Affects Versions: 2.1.1 >Reporter: Naveen Gangam >Assignee: Naveen Gangam > Attachments: HIVE-18328.patch > > > By design, in the {{TAB_COL_STATS}} table of the HMS schema, there should be > ONE AND ONLY ONE row, representing its statistics, for each column defined in > hive. 
A combination of DB_NAME, TABLE_NAME and COLUMN_NAME constitutes a > primary key/unique row. > Each time the statistics are computed for a column, this row is updated. > However, if somehow, via the BDR/replication process, we end up with multiple > rows in this table for a given column, the HMS server cannot reliably update the statistics > thereafter. > So it would be good to detect this data anomaly via the schema validation > tool. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
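The validation HIVE-18328 asks for reduces to a duplicate-key check over {{TAB_COL_STATS}}. A minimal sketch of that check, using an in-memory SQLite table as a stand-in for the metastore (column names follow the ticket; the real schematool would issue the same GROUP BY ... HAVING query against the HMS backing database, whose full schema is omitted here):

```python
import sqlite3

# Stand-in for the HMS TAB_COL_STATS table, trimmed to the key columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TAB_COL_STATS (DB_NAME TEXT, TABLE_NAME TEXT, COLUMN_NAME TEXT)")
conn.executemany("INSERT INTO TAB_COL_STATS VALUES (?, ?, ?)", [
    ("default", "employees", "emp_no"),
    ("default", "employees", "emp_no"),   # duplicate introduced by replication
    ("default", "employees", "last_name"),
])

# Report every (db, table, column) that violates the expected uniqueness.
dupes = conn.execute("""
    SELECT DB_NAME, TABLE_NAME, COLUMN_NAME, COUNT(*)
    FROM TAB_COL_STATS
    GROUP BY DB_NAME, TABLE_NAME, COLUMN_NAME
    HAVING COUNT(*) > 1
""").fetchall()
print(dupes)  # [('default', 'employees', 'emp_no', 2)]
```

An empty result means the one-row-per-column invariant holds; any returned tuples are exactly the anomalies the validator should flag.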
[jira] [Assigned] (HIVE-18376) Update committer-list
[ https://issues.apache.org/jira/browse/HIVE-18376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Drome reassigned HIVE-18376: -- > Update committer-list > - > > Key: HIVE-18376 > URL: https://issues.apache.org/jira/browse/HIVE-18376 > Project: Hive > Issue Type: Bug >Reporter: Chris Drome >Assignee: Chris Drome >Priority: Trivial > > Adding new entry to committer-list: > {noformat} > + > +cdrome > +Chris Drome > + href="https://www.oath.com/";>Oath > + > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-18334) Cannot JOIN ON result of COALESCE
[ https://issues.apache.org/jira/browse/HIVE-18334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Jackson updated HIVE-18334: Component/s: Query Processor > Cannot JOIN ON result of COALESCE > -- > > Key: HIVE-18334 > URL: https://issues.apache.org/jira/browse/HIVE-18334 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 2.3.2 > Environment: Amazon AWS > Release label:emr-5.11.0 > Hadoop distribution:Amazon 2.7.3 > Applications:Hive 2.3.2, Pig 0.17.0, Hue 4.0.1 > classification=hive-site,properties=[hive.strict.checks.cartesian.product=false,hive.mapred.mode=nonstrict] >Reporter: Paul Jackson >Priority: Minor > > A join is returning no results when the ON clause is equating the results of > two COALESCE functions. To reproduce: > {code:SQL} > CREATE TABLE t5 ( > dno INTEGER, > dname VARCHAR(30), > eno INTEGER, > ename VARCHAR(30)); > CREATE TABLE t6 ( > dno INTEGER, > dname VARCHAR(30), > eno INTEGER, > ename VARCHAR(30)); > INSERT INTO t5 VALUES > (10, 'FOO', NULL, NULL), > (20, 'BAR', NULL, NULL), > (NULL, NULL, 7300, 'LARRY'), > (NULL, NULL, 7400, 'MOE'), > (NULL, NULL, 7500, 'CURLY'); > INSERT INTO t6 VALUES > (10, 'LENNON', NULL, NULL), > (20, 'MCCARTNEY', NULL, NULL), > (NULL, NULL, 7300, 'READY'), > (NULL, NULL, 7400, 'WILLING'), > (NULL, NULL, 7500, 'ABLE'); > -- Fails with 0 results > SELECT * > FROM t5 > INNER JOIN t6 > ON COALESCE(`t5`.`eno`, `t5`.`dno`) = COALESCE(`t6`.`eno`, `t6`.`dno`) > -- Full cross with where clause works (in nonstrict mode), returning 5 results > SELECT * > FROM t5 > JOIN t6 > WHERE `t5`.`eno` = `t6`.`eno` OR `t5`.`dno` = `t6`.`dno` > -- Strange that coalescing the same field returns 2 results... 
> SELECT * > FROM t5 > INNER JOIN t6 > ON COALESCE(`t5`.`dno`, `t5`.`dno`) = COALESCE(`t6`.`dno`, `t6`.`dno`) > -- ...and coalescing the other field returns 3 results > SELECT * > FROM t5 > INNER JOIN t6 > ON COALESCE(`t5`.`eno`, `t5`.`eno`) = COALESCE(`t6`.`eno`, `t6`.`eno`) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
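For reference, the first join in HIVE-18334 is well-formed standard SQL, and an engine without the bug returns five rows. A small reproduction of the *expected* behavior with the ticket's data, run against SQLite as a stand-in engine:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t5 (dno INTEGER, dname TEXT, eno INTEGER, ename TEXT);
CREATE TABLE t6 (dno INTEGER, dname TEXT, eno INTEGER, ename TEXT);
INSERT INTO t5 VALUES
  (10, 'FOO', NULL, NULL), (20, 'BAR', NULL, NULL),
  (NULL, NULL, 7300, 'LARRY'), (NULL, NULL, 7400, 'MOE'), (NULL, NULL, 7500, 'CURLY');
INSERT INTO t6 VALUES
  (10, 'LENNON', NULL, NULL), (20, 'MCCARTNEY', NULL, NULL),
  (NULL, NULL, 7300, 'READY'), (NULL, NULL, 7400, 'WILLING'), (NULL, NULL, 7500, 'ABLE');
""")

# Each t5 row's COALESCE key (10, 20, 7300, 7400, 7500) matches exactly
# one t6 row, so the expected result is five rows.
matches = conn.execute("""
    SELECT COUNT(*) FROM t5 INNER JOIN t6
    ON COALESCE(t5.eno, t5.dno) = COALESCE(t6.eno, t6.dno)
""").fetchone()[0]
print(matches)  # 5 (the reported Hive 2.3.2 behavior is 0 rows here)
```

This makes the reported discrepancy concrete: the same ON clause that yields five matches under standard semantics yields none in the affected Hive build.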
[jira] [Resolved] (HIVE-18374) Update committer-list
[ https://issues.apache.org/jira/browse/HIVE-18374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mithun Radhakrishnan resolved HIVE-18374. - Resolution: Fixed Thanks, [~thejas]! :] Committed. > Update committer-list > - > > Key: HIVE-18374 > URL: https://issues.apache.org/jira/browse/HIVE-18374 > Project: Hive > Issue Type: Task >Reporter: Mithun Radhakrishnan >Assignee: Mithun Radhakrishnan >Priority: Trivial > Attachments: HIVE-18374.1.patch > > > I'm afraid I need to make a trivial change to my organization affiliation: > {code:xml} > > mithun > Mithun Radhakrishnan > https://oath.com/";>Oath > > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-18375) Cannot ORDER by subquery fields unless they are selected
[ https://issues.apache.org/jira/browse/HIVE-18375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Jackson updated HIVE-18375: Description: Give these tables: {code:SQL} CREATE TABLE employees ( emp_no INT, first_name VARCHAR(14), last_name VARCHAR(16) ); insert into employees values (1, 'Gottlob', 'Frege'), (2, 'Bertrand', 'Russell'), (3, 'Ludwig', 'Wittgenstein'); CREATE TABLE salaries ( emp_no INT, salary INT, from_date DATE, to_date DATE ); insert into salaries values (1, 10, '1900-01-01', '1900-01-31'), (1, 18, '1900-09-01', '1900-09-30'), (2, 15, '1940-03-01', '1950-01-01'), (3, 20, '1920-01-01', '1950-01-01'); {code} This query returns the names of the employees ordered by their peak salary: {code:SQL} SELECT `employees`.`last_name`, `employees`.`first_name`, `t1`.`max_salary` FROM `default`.`employees` INNER JOIN (SELECT `emp_no`, MAX(`salary`) `max_salary` FROM `default`.`salaries` WHERE `emp_no` IS NOT NULL AND `salary` IS NOT NULL GROUP BY `emp_no`) AS `t1` ON `employees`.`emp_no` = `t1`.`emp_no` ORDER BY `t1`.`max_salary` DESC; {code} However, this should still work even if the max_salary is not part of the projection: {code:SQL} SELECT `employees`.`last_name`, `employees`.`first_name` FROM `default`.`employees` INNER JOIN (SELECT `emp_no`, MAX(`salary`) `max_salary` FROM `default`.`salaries` WHERE `emp_no` IS NOT NULL AND `salary` IS NOT NULL GROUP BY `emp_no`) AS `t1` ON `employees`.`emp_no` = `t1`.`emp_no` ORDER BY `t1`.`max_salary` DESC; {code} However, that fails with this error: {code} Error while compiling statement: FAILED: SemanticException [Error 10004]: line 9:9 Invalid table alias or column reference 't1': (possible column names are: last_name, first_name) {code} FWIW, this also fails: {code:SQL} SELECT `employees`.`last_name`, `employees`.`first_name`, `t1`.`max_salary` AS `max_sal` FROM `default`.`employees` INNER JOIN (SELECT `emp_no`, MAX(`salary`) `max_salary` FROM `default`.`salaries` WHERE `emp_no` IS NOT NULL 
AND `salary` IS NOT NULL GROUP BY `emp_no`) AS `t1` ON `employees`.`emp_no` = `t1`.`emp_no` ORDER BY `t1`.`max_salary` DESC; {code} But this succeeds: {code:SQL} SELECT `employees`.`last_name`, `employees`.`first_name`, `t1`.`max_salary` AS `max_sal` FROM `default`.`employees` INNER JOIN (SELECT `emp_no`, MAX(`salary`) `max_salary` FROM `default`.`salaries` WHERE `emp_no` IS NOT NULL AND `salary` IS NOT NULL GROUP BY `emp_no`) AS `t1` ON `employees`.`emp_no` = `t1`.`emp_no` ORDER BY `max_sal` DESC; {code} was: Give these tables: {code:SQL} CREATE TABLE employees ( emp_no INT, first_name VARCHAR(14), last_name VARCHAR(16) ); insert into employees values (1, 'Gottlob', 'Frege'), (2, 'Bertrand', 'Russell'), (3, 'Ludwig', 'Wittgenstein'); CREATE TABLE salaries ( emp_no INT, salary INT, from_date DATE, to_date DATE ); insert into salaries values (1, 10, '1900-01-01', '1900-01-31'), (1, 18, '1900-09-01', '1900-09-30'), (2, 15, '1940-03-01', '1950-01-01'), (3, 20, '1920-01-01', '1950-01-01'); {code} This query returns the names of the employees ordered by their peak salary: {code:SQL} SELECT `employees`.`last_name`, `employees`.`first_name`, `t1`.`max_salary` FROM `default`.`employees` INNER JOIN (SELECT `emp_no`, MAX(`salary`) `max_salary` FROM `default`.`salaries` WHERE `emp_no` IS NOT NULL AND `salary` IS NOT NULL GROUP BY `emp_no`) AS `t1` ON `employees`.`emp_no` = `t1`.`emp_no` ORDER BY `t1`.`max_salary` DESC; {code} However, this should still work even if the max_salary is not part of the projection: {code:SQL} SELECT `employees`.`last_name`, `employees`.`first_name` FROM `default`.`employees` INNER JOIN (SELECT `emp_no`, MAX(`salary`) `max_salary` FROM `default`.`salaries` WHERE `emp_no` IS NOT NULL AND `salary` IS NOT NULL GROUP BY `emp_no`) AS `t1` ON `employees`.`emp_no` = `t1`.`emp_no` ORDER BY `t1`.`max_salary` DESC; {code} However, that fails with this error: {code} Error while compiling statement: FAILED: SemanticException [Error 10004]: line 9:9 Invalid 
table alias or column reference 't1': (possible column names are: last_name, first_name) {code} > Cannot ORDER by subquery fields unless they are selected > > > Key: HIVE-18375 > URL: https://issues.apache.org/jira/browse/HIVE-18375 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 2.3.2 > Environment: Amazon AWS > Release label:emr-5.11.0 > Hadoop distribution:Amazon 2.7.3 > Applications:Hive 2.3.2, Pig 0.17.0, Hue 4.0.1 > classification=hive-site,properties=[hive.strict.checks.cartesian.product=false,hive.mapred.mode=nonstrict] >Reporter: Paul Jackson >Priority: Minor > > Given these tables: > {code:
[jira] [Commented] (HIVE-18366) Update HBaseSerDe to use hbase.mapreduce.hfileoutputformat.table.name instead of hbase.table.name as the table name property
[ https://issues.apache.org/jira/browse/HIVE-18366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312014#comment-16312014 ] Aihua Xu commented on HIVE-18366: - Attached patch-1: replace the old property name with the new one. Also added a qtest to make sure the right property name is used. > Update HBaseSerDe to use hbase.mapreduce.hfileoutputformat.table.name instead > of hbase.table.name as the table name property > > > Key: HIVE-18366 > URL: https://issues.apache.org/jira/browse/HIVE-18366 > Project: Hive > Issue Type: Sub-task > Components: HBase Handler >Affects Versions: 3.0.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Attachments: HIVE-18366.1.patch > > > HBase 2.0 changes the table name property to > hbase.mapreduce.hfileoutputformat.table.name. HiveHFileOutputFormat is using > the new property name while HiveHBaseTableOutputFormat is not. If we create > the table as follows, HiveHBaseTableOutputFormat is used which still uses the > old property hbase.table.name. > {noformat} > create table hbase_table2(key int, val string) stored by > 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' with serdeproperties > ('hbase.columns.mapping' = ':key,cf:val') tblproperties > ('hbase.mapreduce.hfileoutputformat.table.name' = > 'positive_hbase_handler_bulk') > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
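A common pattern for a property rename like the one in HIVE-18366 is to read the new key and fall back to the legacy one, so existing table definitions keep working. A hedged sketch of that lookup (the helper and its behavior are illustrative, not the actual patch attached to the ticket):

```python
# Old and new names for the HBase table-name property, per the ticket.
OLD_KEY = "hbase.table.name"
NEW_KEY = "hbase.mapreduce.hfileoutputformat.table.name"

def resolve_table_name(table_properties):
    """Prefer the HBase 2.0 property name, fall back to the legacy one."""
    name = table_properties.get(NEW_KEY) or table_properties.get(OLD_KEY)
    if name is None:
        raise ValueError("no HBase table name property set")
    return name

# A table created with only the legacy property still resolves.
legacy = {OLD_KEY: "positive_hbase_handler_bulk"}
print(resolve_table_name(legacy))  # positive_hbase_handler_bulk

# The new property wins when both are present.
both = {OLD_KEY: "old_name", NEW_KEY: "new_name"}
print(resolve_table_name(both))  # new_name
```

The point of the fallback is exactly the mismatch described above: without it, one output format consults the new key while the other consults the old one, and tables defined with only one of the two break.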
[jira] [Commented] (HIVE-18274) add AM level metrics for WM
[ https://issues.apache.org/jira/browse/HIVE-18274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312017#comment-16312017 ] Hive QA commented on HIVE-18274: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12904657/HIVE-18274.01.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 21 failed/errored test(s), 11516 tests executed *Failed tests:* {noformat} TestMiniLlapLocalCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=161) [dynamic_semijoin_reduction.q,materialized_view_create_rewrite_3.q,vectorization_pushdown.q,correlationoptimizer2.q,cbo_gby_empty.q,vectorization_short_regress.q,identity_project_remove_skip.q,mapjoin3.q,cross_product_check_1.q,unionDistinct_3.q,cbo_join.q,correlationoptimizer6.q,union_remove_26.q,cbo_rp_limit.q,vector_groupby_cube1.q,current_date_timestamp.q,union2.q,groupby2.q,schema_evol_text_vec_table.q,dynpart_sort_opt_vectorization.q,exchgpartition2lel.q,multiMapJoin1.q,sample10.q,vectorized_timestamp_ints_casts.q,vector_char_simple.q,auto_sortmerge_join_2.q,bucketizedhiveinputformat.q,vectorization_input_format_excludes.q,cte_mat_2.q,vectorization_8.q] org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[rcfile_format_nonpart] (batchId=248) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join25] (batchId=72) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mapjoin_hook] (batchId=12) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=35) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucketsortoptimize_insert_2] (batchId=151) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[hybridgrace_hashjoin_2] (batchId=156) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=164) 
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] (batchId=168) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast] (batchId=159) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=159) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[bucketizedhiveinputformat] (batchId=177) org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[authorization_part] (batchId=93) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] (batchId=120) org.apache.hadoop.hive.metastore.TestEmbeddedHiveMetaStore.testTransactionalValidation (batchId=213) org.apache.hadoop.hive.ql.exec.tez.TestWorkloadManager.testApplyPlanQpChanges (batchId=284) org.apache.hadoop.hive.ql.io.TestDruidRecordWriter.testWrite (batchId=253) org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints (batchId=225) org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=231) org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=231) org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=231) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/8445/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/8445/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-8445/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 21 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12904657 - PreCommit-HIVE-Build > add AM level metrics for WM > --- > > Key: HIVE-18274 > URL: https://issues.apache.org/jira/browse/HIVE-18274 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-18274.01.patch, HIVE-18274.patch > > > Unused guaranteed tasks (1 metric); guaranteed/speculative tasks x > updated/update in progress (4 metrics). > It should be possible to view those over time as the query is (was) running, > to detect any anomalies. This jira is just to save the correct metrics. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-18366) Update HBaseSerDe to use hbase.mapreduce.hfileoutputformat.table.name instead of hbase.table.name as the table name property
[ https://issues.apache.org/jira/browse/HIVE-18366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-18366: Status: Patch Available (was: Open) > Update HBaseSerDe to use hbase.mapreduce.hfileoutputformat.table.name instead > of hbase.table.name as the table name property > > > Key: HIVE-18366 > URL: https://issues.apache.org/jira/browse/HIVE-18366 > Project: Hive > Issue Type: Sub-task > Components: HBase Handler >Affects Versions: 3.0.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Attachments: HIVE-18366.1.patch > > > HBase 2.0 changes the table name property to > hbase.mapreduce.hfileoutputformat.table.name. HiveHFileOutputFormat is using > the new property name while HiveHBaseTableOutputFormat is not. If we create > the table as follows, HiveHBaseTableOutputFormat is used which still uses the > old property hbase.table.name. > {noformat} > create table hbase_table2(key int, val string) stored by > 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' with serdeproperties > ('hbase.columns.mapping' = ':key,cf:val') tblproperties > ('hbase.mapreduce.hfileoutputformat.table.name' = > 'positive_hbase_handler_bulk') > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-18368) Improve SparkPlan Graph
[ https://issues.apache.org/jira/browse/HIVE-18368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sahil Takiar updated HIVE-18368: Description: The {{SparkPlan}} class does some logging to show the mapping between different {{SparkTran}}, what shuffle types are used, and what trans are cached. However, there is room for improvement. When debug logging is enabled the RDD graph is logged, but there isn't much information printed about each RDD. We should combine both of the graphs and improve them. We could even make the Spark Plan graph part of the {{EXPLAIN EXTENDED}} output. Ideally, the final graph shows a clear relationship between Tran objects, RDDs, and BaseWorks. Edge should include information about number of partitions, shuffle types, Spark operations used, etc. was: The {{SparkPlan}} class does some logging to show the mapping between different {{SparkTran}}s, what shuffle types are used, and what trans are cached. However, there is room for improvement. When debug logging is enabled the RDD graph is logged, but there isn't much information printed about each RDD. We should combine both of the graphs and improve them. We could even make the Spark Plan graph part of the {{EXPLAIN EXTENDED}} output. Ideally, the final graph shows a clear relationship between Tran objects, RDDs, and BaseWorks. Edge should include information about number of partitions, shuffle types, Spark operations used, etc. > Improve SparkPlan Graph > --- > > Key: HIVE-18368 > URL: https://issues.apache.org/jira/browse/HIVE-18368 > Project: Hive > Issue Type: Sub-task > Components: Spark >Reporter: Sahil Takiar >Assignee: Sahil Takiar > > The {{SparkPlan}} class does some logging to show the mapping between > different {{SparkTran}}, what shuffle types are used, and what trans are > cached. However, there is room for improvement. > When debug logging is enabled the RDD graph is logged, but there isn't much > information printed about each RDD. 
> We should combine both graphs and improve them. We could even make the > Spark Plan graph part of the {{EXPLAIN EXTENDED}} output. > Ideally, the final graph shows a clear relationship between Tran objects, > RDDs, and BaseWorks. Edges should include information about the number of > partitions, shuffle types, Spark operations used, etc. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
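As a rough illustration of the kind of combined, annotated graph the description asks for, here is a toy Python sketch where each edge carries shuffle type and partition count. The `PlanEdge`/`render` names and the node labels are hypothetical, not Hive's actual SparkPlan classes:

```python
# Toy combined plan graph: nodes name (Tran, BaseWork) pairs, edges carry
# the shuffle metadata the ticket wants surfaced.
class PlanEdge:
    def __init__(self, src, dst, shuffle_type=None, num_partitions=None):
        self.src, self.dst = src, dst
        self.shuffle_type = shuffle_type
        self.num_partitions = num_partitions

    def __str__(self):
        label = ""
        if self.shuffle_type:
            label = " [%s, partitions=%s]" % (self.shuffle_type, self.num_partitions)
        return "%s -> %s%s" % (self.src, self.dst, label)

def render(edges):
    """Render one annotated edge per line, in insertion order."""
    return "\n".join(str(e) for e in edges)

edges = [
    PlanEdge("MapTran(MapWork-1)", "ShuffleTran"),
    PlanEdge("ShuffleTran", "ReduceTran(ReduceWork-1)",
             shuffle_type="SORT", num_partitions=4),
]
```

A debug log (or an {{EXPLAIN EXTENDED}} section) built this way would show both the Tran-to-Tran structure and the per-edge shuffle details in one place.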
[jira] [Updated] (HIVE-18366) Update HBaseSerDe to use hbase.mapreduce.hfileoutputformat.table.name instead of hbase.table.name as the table name property
[ https://issues.apache.org/jira/browse/HIVE-18366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-18366: Attachment: HIVE-18366.1.patch > Update HBaseSerDe to use hbase.mapreduce.hfileoutputformat.table.name instead > of hbase.table.name as the table name property > > > Key: HIVE-18366 > URL: https://issues.apache.org/jira/browse/HIVE-18366 > Project: Hive > Issue Type: Sub-task > Components: HBase Handler >Affects Versions: 3.0.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Attachments: HIVE-18366.1.patch > > > HBase 2.0 changes the table name property to > hbase.mapreduce.hfileoutputformat.table.name. HiveHFileOutputFormat is using > the new property name while HiveHBaseTableOutputFormat is not. If we create > the table as follows, HiveHBaseTableOutputFormat is used which still uses the > old property hbase.table.name. > {noformat} > create table hbase_table2(key int, val string) stored by > 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' with serdeproperties > ('hbase.columns.mapping' = ':key,cf:val') tblproperties > ('hbase.mapreduce.hfileoutputformat.table.name' = > 'positive_hbase_handler_bulk') > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17718) Hive on Spark Debugging Improvements
[ https://issues.apache.org/jira/browse/HIVE-17718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sahil Takiar updated HIVE-17718: Summary: Hive on Spark Debugging Improvements (was: Spark Logging Improvements) > Hive on Spark Debugging Improvements > > > Key: HIVE-17718 > URL: https://issues.apache.org/jira/browse/HIVE-17718 > Project: Hive > Issue Type: Bug > Components: Spark >Reporter: Sahil Takiar >Assignee: Sahil Takiar > > There are multiple places where it is hard to debug HoS - e.g. the HoS Remote > Driver and Client, the Spark RDD graph, etc. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-18375) Cannot ORDER by subquery fields unless they are selected
[ https://issues.apache.org/jira/browse/HIVE-18375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16311995#comment-16311995 ] Andrew Sherman commented on HIVE-18375: --- Great bug report. [~minions] do you want to take a look at this? > Cannot ORDER by subquery fields unless they are selected > > > Key: HIVE-18375 > URL: https://issues.apache.org/jira/browse/HIVE-18375 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 2.3.2 > Environment: Amazon AWS > Release label:emr-5.11.0 > Hadoop distribution:Amazon 2.7.3 > Applications:Hive 2.3.2, Pig 0.17.0, Hue 4.0.1 > classification=hive-site,properties=[hive.strict.checks.cartesian.product=false,hive.mapred.mode=nonstrict] >Reporter: Paul Jackson >Priority: Minor > > Given these tables: > {code:SQL} > CREATE TABLE employees ( > emp_no INT, > first_name VARCHAR(14), > last_name VARCHAR(16) > ); > insert into employees values > (1, 'Gottlob', 'Frege'), > (2, 'Bertrand', 'Russell'), > (3, 'Ludwig', 'Wittgenstein'); > CREATE TABLE salaries ( > emp_no INT, > salary INT, > from_date DATE, > to_date DATE > ); > insert into salaries values > (1, 10, '1900-01-01', '1900-01-31'), > (1, 18, '1900-09-01', '1900-09-30'), > (2, 15, '1940-03-01', '1950-01-01'), > (3, 20, '1920-01-01', '1950-01-01'); > {code} > This query returns the names of the employees ordered by their peak salary: > {code:SQL} > SELECT `employees`.`last_name`, `employees`.`first_name`, `t1`.`max_salary` > FROM `default`.`employees` > INNER JOIN > (SELECT `emp_no`, MAX(`salary`) `max_salary` > FROM `default`.`salaries` > WHERE `emp_no` IS NOT NULL AND `salary` IS NOT NULL > GROUP BY `emp_no`) AS `t1` > ON `employees`.`emp_no` = `t1`.`emp_no` > ORDER BY `t1`.`max_salary` DESC; > {code} > However, this should still work even if the max_salary is not part of the > projection: > {code:SQL} > SELECT `employees`.`last_name`, `employees`.`first_name` > FROM `default`.`employees` > INNER JOIN > (SELECT
`emp_no`, MAX(`salary`) `max_salary` > FROM `default`.`salaries` > WHERE `emp_no` IS NOT NULL AND `salary` IS NOT NULL > GROUP BY `emp_no`) AS `t1` > ON `employees`.`emp_no` = `t1`.`emp_no` > ORDER BY `t1`.`max_salary` DESC; > {code} > Instead, it fails with this error: > {code} > Error while compiling statement: FAILED: SemanticException [Error 10004]: > line 9:9 Invalid table alias or column reference 't1': (possible column names > are: last_name, first_name) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
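Until the analyzer accepts a non-projected subquery column in ORDER BY, one workaround is to project `max_salary` and let the ORDER BY reference it, dropping the extra column downstream. The sketch below replays the rewritten query shape against sqlite3 from Python's standard library; sqlite is only a stand-in for showing the query shape and expected ordering, not a claim about Hive's behavior:

```python
import sqlite3

# Rebuild the bug report's tables in an in-memory sqlite database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (emp_no INT, first_name TEXT, last_name TEXT);
INSERT INTO employees VALUES
  (1, 'Gottlob', 'Frege'), (2, 'Bertrand', 'Russell'),
  (3, 'Ludwig', 'Wittgenstein');
CREATE TABLE salaries (emp_no INT, salary INT, from_date TEXT, to_date TEXT);
INSERT INTO salaries VALUES
  (1, 10, '1900-01-01', '1900-01-31'), (1, 18, '1900-09-01', '1900-09-30'),
  (2, 15, '1940-03-01', '1950-01-01'), (3, 20, '1920-01-01', '1950-01-01');
""")

# Workaround shape: project max_salary so ORDER BY can reference the alias;
# the consumer then ignores the extra column.
rows = conn.execute("""
SELECT e.last_name, e.first_name, t1.max_salary
FROM employees e
INNER JOIN (SELECT emp_no, MAX(salary) AS max_salary
            FROM salaries
            WHERE emp_no IS NOT NULL AND salary IS NOT NULL
            GROUP BY emp_no) AS t1
  ON e.emp_no = t1.emp_no
ORDER BY t1.max_salary DESC
""").fetchall()
# rows: Wittgenstein (20), Frege (18), Russell (15)
```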
[jira] [Updated] (HIVE-18349) Misc metastore changes for debuggability
[ https://issues.apache.org/jira/browse/HIVE-18349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-18349: - Attachment: HIVE-18349.3.patch Fixed TestMetaStoreEndFunctionListener test failure. Other test failures seem to be happening in master already. > Misc metastore changes for debuggability > > > Key: HIVE-18349 > URL: https://issues.apache.org/jira/browse/HIVE-18349 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 3.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-18349.1.patch, HIVE-18349.2.patch, > HIVE-18349.3.patch > > > 1) The Hive metastore audit event log/metastore log does not log the final status > (success or failed) of the event. Some operations, for example drop_table, return a > boolean success flag, but it never gets logged anywhere. > However, the same is sent to end event listeners and other metastore event > listeners. It would be good to log the final status of the events. > 2) Make the connection timeout when using a connection pool configurable. Currently > it's hard-coded to 30 seconds. > 3) Provide a config to enable connection leak detection for HikariCP, or > enable it when debug logging is enabled. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
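Item 2 above (making the hard-coded 30-second pool timeout configurable) is essentially a config lookup that keeps the old value as the default, so existing deployments see no behavior change. A hedged Python sketch — the `metastore.connection.pool.timeout.secs` key name is hypothetical, not an actual Hive property:

```python
# Replace a hard-coded 30s pool timeout with a config lookup that defaults
# to the old value. The key name below is illustrative only.
DEFAULT_CONNECTION_TIMEOUT_SECS = 30

def connection_timeout_secs(conf):
    """Read the connection-pool timeout from a config dict.

    Falls back to the previously hard-coded 30 seconds when the key is
    absent, and rejects non-positive values early.
    """
    raw = conf.get("metastore.connection.pool.timeout.secs")
    if raw is None:
        return DEFAULT_CONNECTION_TIMEOUT_SECS
    value = int(raw)
    if value <= 0:
        raise ValueError("timeout must be positive, got %d" % value)
    return value
```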
[jira] [Commented] (HIVE-18274) add AM level metrics for WM
[ https://issues.apache.org/jira/browse/HIVE-18274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16311935#comment-16311935 ] Hive QA commented on HIVE-18274: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Findbugs executables are not available. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 39s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 14s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 10s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 10s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 15s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 14s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 14s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 10s{color} | {color:red} llap-tez: The patch generated 7 new + 174 unchanged - 4 fixed = 181 total (was 178) {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. 
Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 10s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 11s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 8m 32s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh | | git revision | master / 3f5148d | | Default Java | 1.8.0_111 | | checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-8445/yetus/diff-checkstyle-llap-tez.txt | | whitespace | http://104.198.109.242/logs//PreCommit-HIVE-Build-8445/yetus/whitespace-eol.txt | | modules | C: llap-tez U: llap-tez | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-8445/yetus.txt | | Powered by | Apache Yetus http://yetus.apache.org | This message was automatically generated. > add AM level metrics for WM > --- > > Key: HIVE-18274 > URL: https://issues.apache.org/jira/browse/HIVE-18274 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-18274.01.patch, HIVE-18274.patch > > > Unused guaranteed tasks (1 metric); guaranteed/speculative tasks x > updated/update in progress (4 metrics). > It should be possible to view those over time as the query is (was) running, > to detect any anomalies. This jira is just to save the correct metrics. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-18352) introduce a METADATAONLY option while doing REPL DUMP to allow integrations of other tools
[ https://issues.apache.org/jira/browse/HIVE-18352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16311920#comment-16311920 ] Hive QA commented on HIVE-18352: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12904562/HIVE-18352.0.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 117 failed/errored test(s), 11546 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[import_addpartition_blobstore_to_blobstore] (batchId=248) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[import_addpartition_blobstore_to_local] (batchId=248) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[import_addpartition_blobstore_to_warehouse] (batchId=248) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[import_addpartition_local_to_blobstore] (batchId=248) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[import_blobstore_to_blobstore] (batchId=248) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[import_blobstore_to_blobstore_nonpart] (batchId=248) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[import_blobstore_to_local] (batchId=248) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[import_blobstore_to_warehouse] (batchId=248) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[import_blobstore_to_warehouse_nonpart] (batchId=248) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[import_local_to_blobstore] (batchId=248) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join25] (batchId=72) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_00_nonpart_empty] (batchId=14) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_01_nonpart] (batchId=54) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_02_00_part_empty] 
(batchId=66) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_02_part] (batchId=50) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_03_nonpart_over_compat] (batchId=6) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_04_all_part] (batchId=28) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_04_evolved_parts] (batchId=31) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_05_some_part] (batchId=73) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_06_one_part] (batchId=86) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_07_all_part_over_nonoverlap] (batchId=10) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_08_nonpart_rename] (batchId=61) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_09_part_spec_nonoverlap] (batchId=8) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_10_external_managed] (batchId=69) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_11_managed_external] (batchId=69) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_12_external_location] (batchId=55) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_13_managed_location] (batchId=39) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_14_managed_location_over_existing] (batchId=54) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_15_external_part] (batchId=39) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_16_part_external] (batchId=59) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_17_part_managed] (batchId=44) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_18_part_external] (batchId=71) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_19_00_part_external_location] (batchId=66) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_19_part_external_location] (batchId=27) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_20_part_managed_location] (batchId=38) 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_22_import_exist_authsuccess] (batchId=18) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_23_import_part_authsuccess] (batchId=21) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_24_import_nonexist_authsuccess] (batchId=19) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_hidden_files] (batchId=47) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mapjoin_hook] (batchId=12) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=35) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[repl_2_exim_basic] (batchId=78) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] (batchId=150) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1] (batchId=169) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucketsortoptimize_insert_2] (b
[jira] [Updated] (HIVE-18328) Improve schematool validator to report duplicate rows for column statistics
[ https://issues.apache.org/jira/browse/HIVE-18328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naveen Gangam updated HIVE-18328: - Status: Patch Available (was: Open) > Improve schematool validator to report duplicate rows for column statistics > --- > > Key: HIVE-18328 > URL: https://issues.apache.org/jira/browse/HIVE-18328 > Project: Hive > Issue Type: Improvement > Components: Hive >Affects Versions: 2.1.1 >Reporter: Naveen Gangam >Assignee: Naveen Gangam > Attachments: HIVE-18328.patch > > > By design, in the {{TAB_COL_STATS}} table of the HMS schema, there should be > ONE AND ONLY ONE row, representing its statistics, for each column defined in > Hive. A combination of DB_NAME, TABLE_NAME and COLUMN_NAME constitutes a > primary key/unique row. > Each time the statistics are computed for a column, this row is updated. > However, if we somehow end up with multiple rows in this table for a given > column (for example via the BDR/replication process), the HMS server cannot > reliably recompute the statistics thereafter. > So it would be good to detect this data anomaly via the schema validation > tool. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
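The proposed validation reduces to a GROUP BY ... HAVING COUNT(*) > 1 over the (DB_NAME, TABLE_NAME, COLUMN_NAME) combination. A small sketch using sqlite3 from Python's standard library, with a deliberately simplified TAB_COL_STATS schema (the real HMS schema has more columns and runs on other databases):

```python
import sqlite3

# Simplified TAB_COL_STATS with a duplicate row, as described above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TAB_COL_STATS (
    DB_NAME TEXT, TABLE_NAME TEXT, COLUMN_NAME TEXT, NUM_NULLS INT);
INSERT INTO TAB_COL_STATS VALUES
  ('default', 't1', 'c1', 0),
  ('default', 't1', 'c1', 5),  -- duplicate introduced by replication
  ('default', 't1', 'c2', 0);
""")

# The validator's core check: any column with more than one stats row.
duplicates = conn.execute("""
SELECT DB_NAME, TABLE_NAME, COLUMN_NAME, COUNT(*) AS n
FROM TAB_COL_STATS
GROUP BY DB_NAME, TABLE_NAME, COLUMN_NAME
HAVING COUNT(*) > 1
""").fetchall()
```

A schematool validator built on this query would report each offending (database, table, column) triple along with its row count.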
[jira] [Updated] (HIVE-18274) add AM level metrics for WM
[ https://issues.apache.org/jira/browse/HIVE-18274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-18274: Attachment: HIVE-18274.01.patch Rebased and updated the patch. I also noticed while modifying that for a new task, symmetrical to a finished task, only one counter needs to be updated. > add AM level metrics for WM > --- > > Key: HIVE-18274 > URL: https://issues.apache.org/jira/browse/HIVE-18274 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-18274.01.patch, HIVE-18274.patch > > > Unused guaranteed tasks (1 metric); guaranteed/speculative tasks x > updated/update in progress (4 metrics). > It should be possible to view those over time as the query is (was) running, > to detect any anomalies. This jira is just to save the correct metrics. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
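The observation above — that a newly started task, symmetrically with a finished one, should touch exactly one counter — can be sketched with the five gauges from the issue description: one for unused guaranteed tasks, plus {guaranteed, speculative} x {updated, update-in-progress}. The class and gauge names here are illustrative, not the actual WM metric names:

```python
# Toy sketch of the five AM-level WM gauges described in HIVE-18274.
class WmAmMetrics:
    def __init__(self):
        self.unused_guaranteed = 0
        self.gauges = {
            ("guaranteed", "updated"): 0,
            ("guaranteed", "in_progress"): 0,
            ("speculative", "updated"): 0,
            ("speculative", "in_progress"): 0,
        }

    def task_started(self, kind):
        # A new task, like a finished one, touches exactly one counter.
        self.gauges[(kind, "in_progress")] += 1

    def task_update_done(self, kind):
        # The update completing moves the task between two gauges.
        self.gauges[(kind, "in_progress")] -= 1
        self.gauges[(kind, "updated")] += 1

m = WmAmMetrics()
m.task_started("guaranteed")
m.task_started("speculative")
m.task_update_done("guaranteed")
```

Sampling gauges like these over a query's lifetime would give the per-AM time series the description asks for.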