[jira] [Comment Edited] (HIVE-24083) hcatalog error in Hadoop 3.3.0: authentication type needed
[ https://issues.apache.org/jira/browse/HIVE-24083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17583397#comment-17583397 ] Zhiguo Wu edited comment on HIVE-24083 at 8/23/22 6:42 AM:
---
[~zabetak] With patch: [https://github.com/apache/hive/pull/3543]
Now this can be reproduced in master branch
Should I create another PR for master branch?
{code:java}
ERROR | 23 Aug 2022 06:34:27,027 | org.apache.hive.hcatalog.templeton.Main | Server failed to start: javax.servlet.ServletException: Authentication type must be specified: simple|kerberos|
    at org.apache.hadoop.security.authentication.server.AuthenticationFilter.init(AuthenticationFilter.java:164) ~[hadoop-auth-3.3.4.jar:?]
    at org.apache.hadoop.security.authentication.server.ProxyUserAuthenticationFilter.init(ProxyUserAuthenticationFilter.java:57) ~[hadoop-common-3.3.4.jar:?]
    at org.eclipse.jetty.servlet.FilterHolder.initialize(FilterHolder.java:140) ~[jetty-runner-9.4.40.v20210413.jar:9.4.40.v20210413]
    at org.eclipse.jetty.servlet.ServletHandler.lambda$initialize$0(ServletHandler.java:731) ~[jetty-runner-9.4.40.v20210413.jar:9.4.40.v20210413]
    at java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948) ~[?:1.8.0_342]
    at java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Streams.java:742) ~[?:1.8.0_342]
    at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:647) ~[?:1.8.0_342]
    at org.eclipse.jetty.servlet.ServletHandler.initialize(ServletHandler.java:755) ~[jetty-runner-9.4.40.v20210413.jar:9.4.40.v20210413]
    at org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:379) ~[jetty-runner-9.4.40.v20210413.jar:9.4.40.v20210413]
    at org.eclipse.jetty.server.handler.ContextHandler.doStart(ContextHandler.java:911) ~[jetty-runner-9.4.40.v20210413.jar:9.4.40.v20210413]
    at org.eclipse.jetty.servlet.ServletContextHandler.doStart(ServletContextHandler.java:288) ~[jetty-runner-9.4.40.v20210413.jar:9.4.40.v20210413]
    at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:73) ~[jetty-runner-9.4.40.v20210413.jar:9.4.40.v20210413]
    at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:169) ~[jetty-runner-9.4.40.v20210413.jar:9.4.40.v20210413]
    at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:117) ~[jetty-runner-9.4.40.v20210413.jar:9.4.40.v20210413]
    at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:97) ~[jetty-runner-9.4.40.v20210413.jar:9.4.40.v20210413]
    at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:73) ~[jetty-runner-9.4.40.v20210413.jar:9.4.40.v20210413]
    at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:169) ~[jetty-runner-9.4.40.v20210413.jar:9.4.40.v20210413]
    at org.eclipse.jetty.server.Server.start(Server.java:423) ~[jetty-runner-9.4.40.v20210413.jar:9.4.40.v20210413]
    at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:110) ~[jetty-runner-9.4.40.v20210413.jar:9.4.40.v20210413]
    at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:97) ~[jetty-runner-9.4.40.v20210413.jar:9.4.40.v20210413]
    at org.eclipse.jetty.server.Server.doStart(Server.java:387) ~[jetty-runner-9.4.40.v20210413.jar:9.4.40.v20210413]
    at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:73) ~[jetty-runner-9.4.40.v20210413.jar:9.4.40.v20210413]
    at org.apache.hive.hcatalog.templeton.Main.runServer(Main.java:255) ~[hive-webhcat-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT]
    at org.apache.hive.hcatalog.templeton.Main.run(Main.java:147) [hive-webhcat-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT]
    at org.apache.hive.hcatalog.templeton.Main.main(Main.java:394) [hive-webhcat-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT]
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_342]
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_342]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_342]
    at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_342]
    at org.apache.hadoop.util.RunJar.run(RunJar.java:323) [hadoop-common-3.3.4.jar:?]
    at or
[jira] [Commented] (HIVE-24083) hcatalog error in Hadoop 3.3.0: authentication type needed
[ https://issues.apache.org/jira/browse/HIVE-24083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17583397#comment-17583397 ] Zhiguo Wu commented on HIVE-24083:
--
[~zabetak] With patch: [https://github.com/apache/hive/pull/3543]
Now this can be reproduced in master branch
{code:java}
ERROR | 23 Aug 2022 06:34:27,027 | org.apache.hive.hcatalog.templeton.Main | Server failed to start: javax.servlet.ServletException: Authentication type must be specified: simple|kerberos|
    at org.apache.hadoop.security.authentication.server.AuthenticationFilter.init(AuthenticationFilter.java:164) ~[hadoop-auth-3.3.4.jar:?]
    at org.apache.hadoop.security.authentication.server.ProxyUserAuthenticationFilter.init(ProxyUserAuthenticationFilter.java:57) ~[hadoop-common-3.3.4.jar:?]
    at org.eclipse.jetty.servlet.FilterHolder.initialize(FilterHolder.java:140) ~[jetty-runner-9.4.40.v20210413.jar:9.4.40.v20210413]
    at org.eclipse.jetty.servlet.ServletHandler.lambda$initialize$0(ServletHandler.java:731) ~[jetty-runner-9.4.40.v20210413.jar:9.4.40.v20210413]
    at java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948) ~[?:1.8.0_342]
    at java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Streams.java:742) ~[?:1.8.0_342]
    at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:647) ~[?:1.8.0_342]
    at org.eclipse.jetty.servlet.ServletHandler.initialize(ServletHandler.java:755) ~[jetty-runner-9.4.40.v20210413.jar:9.4.40.v20210413]
    at org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:379) ~[jetty-runner-9.4.40.v20210413.jar:9.4.40.v20210413]
    at org.eclipse.jetty.server.handler.ContextHandler.doStart(ContextHandler.java:911) ~[jetty-runner-9.4.40.v20210413.jar:9.4.40.v20210413]
    at org.eclipse.jetty.servlet.ServletContextHandler.doStart(ServletContextHandler.java:288) ~[jetty-runner-9.4.40.v20210413.jar:9.4.40.v20210413]
    at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:73) ~[jetty-runner-9.4.40.v20210413.jar:9.4.40.v20210413]
    at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:169) ~[jetty-runner-9.4.40.v20210413.jar:9.4.40.v20210413]
    at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:117) ~[jetty-runner-9.4.40.v20210413.jar:9.4.40.v20210413]
    at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:97) ~[jetty-runner-9.4.40.v20210413.jar:9.4.40.v20210413]
    at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:73) ~[jetty-runner-9.4.40.v20210413.jar:9.4.40.v20210413]
    at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:169) ~[jetty-runner-9.4.40.v20210413.jar:9.4.40.v20210413]
    at org.eclipse.jetty.server.Server.start(Server.java:423) ~[jetty-runner-9.4.40.v20210413.jar:9.4.40.v20210413]
    at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:110) ~[jetty-runner-9.4.40.v20210413.jar:9.4.40.v20210413]
    at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:97) ~[jetty-runner-9.4.40.v20210413.jar:9.4.40.v20210413]
    at org.eclipse.jetty.server.Server.doStart(Server.java:387) ~[jetty-runner-9.4.40.v20210413.jar:9.4.40.v20210413]
    at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:73) ~[jetty-runner-9.4.40.v20210413.jar:9.4.40.v20210413]
    at org.apache.hive.hcatalog.templeton.Main.runServer(Main.java:255) ~[hive-webhcat-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT]
    at org.apache.hive.hcatalog.templeton.Main.run(Main.java:147) [hive-webhcat-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT]
    at org.apache.hive.hcatalog.templeton.Main.main(Main.java:394) [hive-webhcat-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT]
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_342]
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_342]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_342]
    at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_342]
    at org.apache.hadoop.util.RunJar.run(RunJar.java:323) [hadoop-common-3.3.4.jar:?]
    at org.apache.hadoop.util.RunJar.main(RunJar.java:236) [hadoop-common-3.3.4.jar:?]
{code}
> hcatalog error in Hadoop 3.3.0: authentication type needed
> --
>
> Key: HIVE-24083
> URL: https://issues.apache.org/jira/browse/HIVE-24083
> Project: Hive
> Issue Type: Bug
> Com
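The ServletException in the trace above is raised when Hadoop's AuthenticationFilter finds no authentication type in its filter configuration. The following is a simplified, self-contained sketch of that failure mode, not the actual Hadoop source; the class name and the handler-name strings here are illustrative placeholders (Hadoop's built-in handlers are configured via a "type" init parameter whose value is "simple", "kerberos", or a handler class name):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical simplification of the validation that produces
// "Authentication type must be specified: simple|kerberos|..." at startup.
public class AuthTypeCheck {
    static String resolveAuthHandler(Map<String, String> filterConfig) {
        String authType = filterConfig.get("type");
        if (authType == null) {
            // Mirrors the failure seen in the log when no "type" is configured.
            throw new IllegalStateException(
                "Authentication type must be specified: simple|kerberos|<class>");
        }
        // "simple" and "kerberos" select built-in handlers; anything else is
        // treated as a custom handler class name.
        switch (authType) {
            case "simple":   return "PseudoAuthenticationHandler";
            case "kerberos": return "KerberosAuthenticationHandler";
            default:         return authType;
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("type", "simple");
        System.out.println(resolveAuthHandler(conf));

        try {
            // No "type" entry: fails the same way the WebHCat startup does above.
            resolveAuthHandler(new HashMap<>());
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In other words, the filter never gets as far as authenticating anything; startup aborts because the configuration passed down to the filter is missing the type entry entirely.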
[jira] [Updated] (HIVE-26286) Hive WebHCat Tests are failing
[ https://issues.apache.org/jira/browse/HIVE-26286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhiguo Wu updated HIVE-26286: - Status: Patch Available (was: Open) > Hive WebHCat Tests are failing > -- > > Key: HIVE-26286 > URL: https://issues.apache.org/jira/browse/HIVE-26286 > Project: Hive > Issue Type: Bug > Components: HCatalog >Affects Versions: 4.0.0 > Environment: [link title|http://example.com] >Reporter: Anmol Sundaram >Assignee: Zhiguo Wu >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > The Hive TestWebHCatE2e tests seem to be failing due to > > {quote}templeton: Server failed to start: null > [main] ERROR org.apache.hive.hcatalog.templeton.Main - Server failed to > start: > java.lang.NullPointerException > at > org.eclipse.jetty.server.AbstractConnector.(AbstractConnector.java:174) > at > org.eclipse.jetty.server.AbstractNetworkConnector.(AbstractNetworkConnector.java:44) > at org.eclipse.jetty.server.ServerConnector.(ServerConnector.java:220) > at org.eclipse.jetty.server.ServerConnector.(ServerConnector.java:143) > at > org.apache.hive.hcatalog.templeton.Main.createChannelConnector(Main.java:295) > at org.apache.hive.hcatalog.templeton.Main.runServer(Main.java:252) > at org.apache.hive.hcatalog.templeton.Main.run(Main.java:147) > at > org.apache.hive.hcatalog.templeton.TestWebHCatE2e.startHebHcatInMem(TestWebHCatE2e.java:94) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59){quote} > {quote} {quote} > This seems to be caused due to HIVE-18728 , which is breaking. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HIVE-26286) Hive WebHCat Tests are failing
[ https://issues.apache.org/jira/browse/HIVE-26286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-26286: -- Labels: pull-request-available (was: ) > Hive WebHCat Tests are failing > -- > > Key: HIVE-26286 > URL: https://issues.apache.org/jira/browse/HIVE-26286 > Project: Hive > Issue Type: Bug > Components: HCatalog >Affects Versions: 4.0.0 > Environment: [link title|http://example.com] >Reporter: Anmol Sundaram >Assignee: Zhiguo Wu >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > The Hive TestWebHCatE2e tests seem to be failing due to > > {quote}templeton: Server failed to start: null > [main] ERROR org.apache.hive.hcatalog.templeton.Main - Server failed to > start: > java.lang.NullPointerException > at > org.eclipse.jetty.server.AbstractConnector.(AbstractConnector.java:174) > at > org.eclipse.jetty.server.AbstractNetworkConnector.(AbstractNetworkConnector.java:44) > at org.eclipse.jetty.server.ServerConnector.(ServerConnector.java:220) > at org.eclipse.jetty.server.ServerConnector.(ServerConnector.java:143) > at > org.apache.hive.hcatalog.templeton.Main.createChannelConnector(Main.java:295) > at org.apache.hive.hcatalog.templeton.Main.runServer(Main.java:252) > at org.apache.hive.hcatalog.templeton.Main.run(Main.java:147) > at > org.apache.hive.hcatalog.templeton.TestWebHCatE2e.startHebHcatInMem(TestWebHCatE2e.java:94) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59){quote} > {quote} {quote} > This seems to be caused due to HIVE-18728 , which is breaking. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-26286) Hive WebHCat Tests are failing
[ https://issues.apache.org/jira/browse/HIVE-26286?focusedWorklogId=802710&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-802710 ] ASF GitHub Bot logged work on HIVE-26286:
-
Author: ASF GitHub Bot
Created on: 23/Aug/22 06:36
Start Date: 23/Aug/22 06:36
Worklog Time Spent: 10m
Work Description: kevinw66 opened a new pull request, #3543:
URL: https://github.com/apache/hive/pull/3543

### What changes were proposed in this pull request?

### Why are the changes needed?

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

Issue Time Tracking
---
Worklog Id: (was: 802710)
Remaining Estimate: 0h
Time Spent: 10m

> Hive WebHCat Tests are failing > -- > > Key: HIVE-26286 > URL: https://issues.apache.org/jira/browse/HIVE-26286 > Project: Hive > Issue Type: Bug > Components: HCatalog >Affects Versions: 4.0.0 > Environment: [link title|http://example.com] >Reporter: Anmol Sundaram >Assignee: Zhiguo Wu >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > The Hive TestWebHCatE2e tests seem to be failing due to > > {quote}templeton: Server failed to start: null > [main] ERROR org.apache.hive.hcatalog.templeton.Main - Server failed to > start: > java.lang.NullPointerException > at > org.eclipse.jetty.server.AbstractConnector.(AbstractConnector.java:174) > at > org.eclipse.jetty.server.AbstractNetworkConnector.(AbstractNetworkConnector.java:44) > at org.eclipse.jetty.server.ServerConnector.(ServerConnector.java:220) > at org.eclipse.jetty.server.ServerConnector.(ServerConnector.java:143) > at > org.apache.hive.hcatalog.templeton.Main.createChannelConnector(Main.java:295) > at org.apache.hive.hcatalog.templeton.Main.runServer(Main.java:252) > at org.apache.hive.hcatalog.templeton.Main.run(Main.java:147) > at > org.apache.hive.hcatalog.templeton.TestWebHCatE2e.startHebHcatInMem(TestWebHCatE2e.java:94) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59){quote} > {quote} {quote} > This seems to be caused due to HIVE-18728 , which is breaking. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (HIVE-26286) Hive WebHCat Tests are failing
[ https://issues.apache.org/jira/browse/HIVE-26286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhiguo Wu reassigned HIVE-26286: Assignee: Zhiguo Wu > Hive WebHCat Tests are failing > -- > > Key: HIVE-26286 > URL: https://issues.apache.org/jira/browse/HIVE-26286 > Project: Hive > Issue Type: Bug > Components: HCatalog >Affects Versions: 4.0.0 > Environment: [link title|http://example.com] >Reporter: Anmol Sundaram >Assignee: Zhiguo Wu >Priority: Major > > The Hive TestWebHCatE2e tests seem to be failing due to > > {quote}templeton: Server failed to start: null > [main] ERROR org.apache.hive.hcatalog.templeton.Main - Server failed to > start: > java.lang.NullPointerException > at > org.eclipse.jetty.server.AbstractConnector.(AbstractConnector.java:174) > at > org.eclipse.jetty.server.AbstractNetworkConnector.(AbstractNetworkConnector.java:44) > at org.eclipse.jetty.server.ServerConnector.(ServerConnector.java:220) > at org.eclipse.jetty.server.ServerConnector.(ServerConnector.java:143) > at > org.apache.hive.hcatalog.templeton.Main.createChannelConnector(Main.java:295) > at org.apache.hive.hcatalog.templeton.Main.runServer(Main.java:252) > at org.apache.hive.hcatalog.templeton.Main.run(Main.java:147) > at > org.apache.hive.hcatalog.templeton.TestWebHCatE2e.startHebHcatInMem(TestWebHCatE2e.java:94) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59){quote} > {quote} {quote} > This seems to be caused due to HIVE-18728 , which is breaking. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-26479) Add ability to set parameters for query-based compaction
[ https://issues.apache.org/jira/browse/HIVE-26479?focusedWorklogId=802707&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-802707 ] ASF GitHub Bot logged work on HIVE-26479:
-
Author: ASF GitHub Bot
Created on: 23/Aug/22 06:16
Start Date: 23/Aug/22 06:16
Worklog Time Spent: 10m
Work Description: SourabhBadhya commented on code in PR #3528:
URL: https://github.com/apache/hive/pull/3528#discussion_r952191534

##########
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionInfo.java:
##########

@@ -109,6 +109,13 @@ public void setProperty(String key, String value) {
     properties = propertiesMap.toString();
   }
+
+  public StringableMap getPropertiesMap() {

Review Comment:
Please see [this](https://github.com/apache/hive/pull/3528#discussion_r952191010) comment.

Issue Time Tracking
---
Worklog Id: (was: 802707)
Time Spent: 1h 20m (was: 1h 10m)

> Add ability to set parameters for query-based compaction > > > Key: HIVE-26479 > URL: https://issues.apache.org/jira/browse/HIVE-26479 > Project: Hive > Issue Type: Improvement >Reporter: Sourabh Badhya >Assignee: Sourabh Badhya >Priority: Major > Labels: pull-request-available > Time Spent: 1h 20m > Remaining Estimate: 0h > > [HIVE-13354|https://issues.apache.org/jira/browse/HIVE-13354] introduced the > ability to set some parameters for the compaction through table properties, > like the mapper memory size or compaction thresholds. This could be useful > for the query-based compaction as well, for example if the insert of the > query-based compaction is failing, we would have a possibility to tune the > compaction run directly. First it should be investigated which properties are > possible and would make sense to set for the query base compaction. Then > implement this feature for the query-based compaction. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-26479) Add ability to set parameters for query-based compaction
[ https://issues.apache.org/jira/browse/HIVE-26479?focusedWorklogId=802705&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-802705 ] ASF GitHub Bot logged work on HIVE-26479:
-
Author: ASF GitHub Bot
Created on: 23/Aug/22 06:15
Start Date: 23/Aug/22 06:15
Worklog Time Spent: 10m
Work Description: SourabhBadhya commented on code in PR #3528:
URL: https://github.com/apache/hive/pull/3528#discussion_r952191010

##########
ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/QueryCompactor.java:
##########

@@ -272,5 +277,23 @@ static void removeFilesForMmTable(HiveConf conf, AcidDirectory dir) throws IOExc
         fs.delete(dead, true);
       }
     }
+
+    static void overrideConfProps(HiveConf conf, CompactionInfo ci, Map properties) {
+      for (String key : properties.keySet()) {
+        if (key.startsWith(COMPACTOR_PREFIX)) {
+          String property = key.substring(10); // 10 is the length of "compactor." We only keep the rest.
+          conf.set(property, properties.get(key));
+        }
+      }
+
+      // Give preference to properties coming from compaction
+      // over table properties
+      for (String key : ci.getPropertiesMap().keySet()) {

Review Comment:
This is required for the `ALTER TABLE COMPACT 'major' WITH TBLPROPERTIES OVERWRITE ()` scenario. Here the properties are entered into `COMPACTION_QUEUE` and when the table is picked for compaction, the table properties which are set in `CQ_TBLPROPERTIES` must also be picked. All this information is stored in CompactionInfo, hence the need to set the properties as here. Same behaviour is seen for MR-based compaction as well. 
Issue Time Tracking --- Worklog Id: (was: 802705) Time Spent: 1h 10m (was: 1h) > Add ability to set parameters for query-based compaction > > > Key: HIVE-26479 > URL: https://issues.apache.org/jira/browse/HIVE-26479 > Project: Hive > Issue Type: Improvement >Reporter: Sourabh Badhya >Assignee: Sourabh Badhya >Priority: Major > Labels: pull-request-available > Time Spent: 1h 10m > Remaining Estimate: 0h > > [HIVE-13354|https://issues.apache.org/jira/browse/HIVE-13354] introduced the > ability to set some parameters for the compaction through table properties, > like the mapper memory size or compaction thresholds. This could be useful > for the query-based compaction as well, for example if the insert of the > query-based compaction is failing, we would have a possibility to tune the > compaction run directly. First it should be investigated which properties are > possible and would make sense to set for the query base compaction. Then > implement this feature for the query-based compaction. -- This message was sent by Atlassian Jira (v8.20.10#820010)
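The review thread above describes two rules: table properties prefixed with "compactor." are applied to the job configuration with the prefix stripped, and properties supplied on the compaction request itself (e.g. via `TBLPROPERTIES OVERWRITE`) take precedence over the table-level ones. A self-contained sketch of that merge order (illustrative names, not the Hive implementation):

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch of the override order discussed in the review:
// 1) table properties starting with "compactor." are applied first (prefix stripped),
// 2) properties from the compaction request override them.
public class CompactorPropsSketch {
    static final String COMPACTOR_PREFIX = "compactor.";

    static Map<String, String> effectiveProps(Map<String, String> tableProps,
                                              Map<String, String> requestProps) {
        Map<String, String> conf = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : tableProps.entrySet()) {
            if (e.getKey().startsWith(COMPACTOR_PREFIX)) {
                // Strip the "compactor." prefix (10 characters) and keep the rest.
                conf.put(e.getKey().substring(COMPACTOR_PREFIX.length()), e.getValue());
            }
        }
        // Request-level properties win over table-level ones.
        conf.putAll(requestProps);
        return conf;
    }

    public static void main(String[] args) {
        Map<String, String> table = new HashMap<>();
        table.put("compactor.mapreduce.map.memory.mb", "2048");
        table.put("unrelated.key", "ignored"); // no prefix, so not applied
        Map<String, String> request = new HashMap<>();
        request.put("mapreduce.map.memory.mb", "4096");
        // The request-level 4096 overrides the table-level 2048.
        System.out.println(effectiveProps(table, request));
    }
}
```

The design point debated in the thread is exactly this precedence: if the merge order were reversed, a per-request `TBLPROPERTIES OVERWRITE` would silently lose to whatever is stored on the table.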
[jira] [Comment Edited] (HIVE-26286) Hive WebHCat Tests are failing
[ https://issues.apache.org/jira/browse/HIVE-26286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17583350#comment-17583350 ] Zhiguo Wu edited comment on HIVE-26286 at 8/23/22 5:47 AM: --- this can also be reproduced by running {code:java} ./webhcat_server.sh start {code} was (Author: JIRAUSER294567): Also, this can be simply reproduced by running {code:java} ./webhcat_server.sh start {code} > Hive WebHCat Tests are failing > -- > > Key: HIVE-26286 > URL: https://issues.apache.org/jira/browse/HIVE-26286 > Project: Hive > Issue Type: Bug > Components: HCatalog >Affects Versions: 4.0.0 > Environment: [link title|http://example.com] >Reporter: Anmol Sundaram >Priority: Major > > The Hive TestWebHCatE2e tests seem to be failing due to > > {quote}templeton: Server failed to start: null > [main] ERROR org.apache.hive.hcatalog.templeton.Main - Server failed to > start: > java.lang.NullPointerException > at > org.eclipse.jetty.server.AbstractConnector.(AbstractConnector.java:174) > at > org.eclipse.jetty.server.AbstractNetworkConnector.(AbstractNetworkConnector.java:44) > at org.eclipse.jetty.server.ServerConnector.(ServerConnector.java:220) > at org.eclipse.jetty.server.ServerConnector.(ServerConnector.java:143) > at > org.apache.hive.hcatalog.templeton.Main.createChannelConnector(Main.java:295) > at org.apache.hive.hcatalog.templeton.Main.runServer(Main.java:252) > at org.apache.hive.hcatalog.templeton.Main.run(Main.java:147) > at > org.apache.hive.hcatalog.templeton.TestWebHCatE2e.startHebHcatInMem(TestWebHCatE2e.java:94) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59){quote} > {quote} {quote} > 
This seems to be caused due to HIVE-18728 , which is breaking. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (HIVE-26286) Hive WebHCat Tests are failing
[ https://issues.apache.org/jira/browse/HIVE-26286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17583350#comment-17583350 ] Zhiguo Wu edited comment on HIVE-26286 at 8/23/22 5:47 AM: --- This can also be reproduced by running {code:java} ./webhcat_server.sh start {code} was (Author: JIRAUSER294567): this can also be reproduced by running {code:java} ./webhcat_server.sh start {code} > Hive WebHCat Tests are failing > -- > > Key: HIVE-26286 > URL: https://issues.apache.org/jira/browse/HIVE-26286 > Project: Hive > Issue Type: Bug > Components: HCatalog >Affects Versions: 4.0.0 > Environment: [link title|http://example.com] >Reporter: Anmol Sundaram >Priority: Major > > The Hive TestWebHCatE2e tests seem to be failing due to > > {quote}templeton: Server failed to start: null > [main] ERROR org.apache.hive.hcatalog.templeton.Main - Server failed to > start: > java.lang.NullPointerException > at > org.eclipse.jetty.server.AbstractConnector.(AbstractConnector.java:174) > at > org.eclipse.jetty.server.AbstractNetworkConnector.(AbstractNetworkConnector.java:44) > at org.eclipse.jetty.server.ServerConnector.(ServerConnector.java:220) > at org.eclipse.jetty.server.ServerConnector.(ServerConnector.java:143) > at > org.apache.hive.hcatalog.templeton.Main.createChannelConnector(Main.java:295) > at org.apache.hive.hcatalog.templeton.Main.runServer(Main.java:252) > at org.apache.hive.hcatalog.templeton.Main.run(Main.java:147) > at > org.apache.hive.hcatalog.templeton.TestWebHCatE2e.startHebHcatInMem(TestWebHCatE2e.java:94) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59){quote} > {quote} {quote} > This 
seems to be caused due to HIVE-18728 , which is breaking. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-26481) Cleaner fails with FileNotFoundException
[ https://issues.apache.org/jira/browse/HIVE-26481?focusedWorklogId=802699&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-802699 ] ASF GitHub Bot logged work on HIVE-26481:
-
Author: ASF GitHub Bot
Created on: 23/Aug/22 05:38
Start Date: 23/Aug/22 05:38
Worklog Time Spent: 10m
Work Description: rkirtir commented on code in PR #3531:
URL: https://github.com/apache/hive/pull/3531#discussion_r952170558

##########
ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java:
##########

@@ -1503,24 +1503,31 @@ public static Map getHdfsDirSnapshotsForCleaner(final Fil
     stack.push(fs.listStatusIterator(path));
     while (!stack.isEmpty()) {
       RemoteIterator itr = stack.pop();
-      while (itr.hasNext()) {
-        FileStatus fStatus = itr.next();
-        Path fPath = fStatus.getPath();
-        if (acidHiddenFileFilter.accept(fPath)) {
-          if (baseFileFilter.accept(fPath) ||
-              deltaFileFilter.accept(fPath) ||
-              deleteEventDeltaDirFilter.accept(fPath)) {
-            addToSnapshoot(dirToSnapshots, fPath);
-          } else {
-            if (fStatus.isDirectory()) {
-              stack.push(fs.listStatusIterator(fPath));
+      try {
+        while (itr.hasNext()) {
+          FileStatus fStatus = itr.next();
+          Path fPath = fStatus.getPath();
+          if (acidHiddenFileFilter.accept(fPath)) {
+            if (baseFileFilter.accept(fPath) ||
+                deltaFileFilter.accept(fPath) ||
+                deleteEventDeltaDirFilter.accept(fPath)) {
+              addToSnapshoot(dirToSnapshots, fPath);
            } else {
-              // Found an original file
-              HdfsDirSnapshot hdfsDirSnapshot = addToSnapshoot(dirToSnapshots, fPath.getParent());
-              hdfsDirSnapshot.addFile(fStatus);
+              if (fStatus.isDirectory()) {
+                stack.push(fs.listStatusIterator(fPath));
+              } else {
+                // Found an original file
+                HdfsDirSnapshot hdfsDirSnapshot = addToSnapshoot(dirToSnapshots, fPath.getParent());
+                hdfsDirSnapshot.addFile(fStatus);
+              }
            }
          }
        }
+      } catch (FileNotFoundException fne) {
+        // Ignore
+        // As current FS API doesn't provide the ability to supply a PathFilter to ignore the staging dirs,
+        // need to catch this exception

Review Comment:
Thanks Ayush for this. 
I have made necessary changes with RemoteIterators.filteringRemoteIterator Issue Time Tracking --- Worklog Id: (was: 802699) Time Spent: 2h 10m (was: 2h) > Cleaner fails with FileNotFoundException > > > Key: HIVE-26481 > URL: https://issues.apache.org/jira/browse/HIVE-26481 > Project: Hive > Issue Type: Bug >Reporter: KIRTI RUGE >Priority: Major > Labels: pull-request-available > Time Spent: 2h 10m > Remaining Estimate: 0h > > The compaction fails when the Cleaner tried to remove a missing directory > from HDFS. > {code:java} > 2022-08-05 18:56:38,873 INFO org.apache.hadoop.hive.ql.txn.compactor.Cleaner: > [Cleaner-executor-thread-0]: Starting cleaning for > id:30,dbname:default,tableName:test_concur_compaction_minor,partName:null,state:�,type:MINOR,enqueueTime:0,start:0,properties:null,runAs:hive,tooManyAborts:false,hasOldAbort:false,highestWriteId:4,errorMessage:null,workerId: > null,initiatorId: null 2022-08-05 18:56:38,888 ERROR > org.apache.hadoop.hive.ql.txn.compactor.Cleaner: [Cleaner-executor-thread-0]: > Caught exception when cleaning, unable to complete cleaning of > id:30,dbname:default,tableName:test_concur_compaction_minor,partName:null,state:�,type:MINOR,enqueueTime:0,start:0,properties:null,runAs:hive,tooManyAborts:false,hasOldAbort:false,highestWriteId:4,errorMessage:null,workerId: > null,initiatorId: null java.io.FileNotFoundException: File > hdfs://ns1/warehouse/tablespace/managed/hive/test_concur_compaction_minor/.hive-staging_hive_2022-08-05_18-56-37_115_5049319600695911622-37 > does not exist. 
at > org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.(DistributedFileSystem.java:1275) > at > org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.(DistributedFileSystem.java:1249) > at > org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1194) > at > org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1190) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.listLocatedStatus(DistributedFileSystem.java:1208) > at org.apache.hadoop.fs.FileSystem.listLocatedStatus(FileSystem.java:2144) > at org.apache.
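The follow-up in this thread replaces the broad catch with Hadoop's `RemoteIterators.filteringRemoteIterator`, i.e. wrapping the directory-listing iterator so unwanted entries (such as transient `.hive-staging` directories) are skipped instead of surfacing as failures. A self-contained sketch of that wrapping pattern with plain `java.util` iterators (the names below are illustrative, not the Hadoop or Hive implementation):

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

// Sketch of the filtering-iterator idea: wrap a listing iterator so entries
// rejected by a predicate are skipped transparently by the caller.
public class FilteringIteratorSketch {
    static <T> Iterator<T> filtering(Iterator<T> src, Predicate<T> keep) {
        return new Iterator<T>() {
            private T next;
            private boolean ready;

            @Override public boolean hasNext() {
                // Advance past rejected entries until one passes the filter.
                while (!ready && src.hasNext()) {
                    T candidate = src.next();
                    if (keep.test(candidate)) {
                        next = candidate;
                        ready = true;
                    }
                }
                return ready;
            }

            @Override public T next() {
                if (!hasNext()) throw new NoSuchElementException();
                ready = false;
                return next;
            }
        };
    }

    public static void main(String[] args) {
        List<String> listing = List.of("base_0000004", ".hive-staging_hive_2022", "delta_0000001_0000002");
        Iterator<String> it = filtering(listing.iterator(), name -> !name.startsWith(".hive-staging"));
        // The staging directory is skipped; only ACID base/delta entries remain.
        while (it.hasNext()) {
            System.out.println(it.next());
        }
    }
}
```

The design advantage over catching FileNotFoundException is that the filter excludes the volatile staging directories before the Cleaner ever tries to list them, rather than abandoning the whole listing when one entry disappears mid-scan.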
[jira] [Commented] (HIVE-26286) Hive WebHCat Tests are failing
[ https://issues.apache.org/jira/browse/HIVE-26286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17583350#comment-17583350 ] Zhiguo Wu commented on HIVE-26286: -- Also, this can be simply reproduced by running {code:java} ./webhcat_server.sh start {code} > Hive WebHCat Tests are failing > -- > > Key: HIVE-26286 > URL: https://issues.apache.org/jira/browse/HIVE-26286 > Project: Hive > Issue Type: Bug > Components: HCatalog >Affects Versions: 4.0.0 > Environment: [link title|http://example.com] >Reporter: Anmol Sundaram >Priority: Major > > The Hive TestWebHCatE2e tests seem to be failing due to > > {quote}templeton: Server failed to start: null > [main] ERROR org.apache.hive.hcatalog.templeton.Main - Server failed to > start: > java.lang.NullPointerException > at > org.eclipse.jetty.server.AbstractConnector.(AbstractConnector.java:174) > at > org.eclipse.jetty.server.AbstractNetworkConnector.(AbstractNetworkConnector.java:44) > at org.eclipse.jetty.server.ServerConnector.(ServerConnector.java:220) > at org.eclipse.jetty.server.ServerConnector.(ServerConnector.java:143) > at > org.apache.hive.hcatalog.templeton.Main.createChannelConnector(Main.java:295) > at org.apache.hive.hcatalog.templeton.Main.runServer(Main.java:252) > at org.apache.hive.hcatalog.templeton.Main.run(Main.java:147) > at > org.apache.hive.hcatalog.templeton.TestWebHCatE2e.startHebHcatInMem(TestWebHCatE2e.java:94) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59){quote} > {quote} {quote} > This seems to be caused due to HIVE-18728 , which is breaking. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-26481) Cleaner fails with FileNotFoundException
[ https://issues.apache.org/jira/browse/HIVE-26481?focusedWorklogId=802689&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-802689 ] ASF GitHub Bot logged work on HIVE-26481: - Author: ASF GitHub Bot Created on: 23/Aug/22 04:41 Start Date: 23/Aug/22 04:41 Worklog Time Spent: 10m Work Description: sonarcloud[bot] commented on PR #3531: URL: https://github.com/apache/hive/pull/3531#issuecomment-1223531917 Kudos, SonarCloud Quality Gate passed! [![Quality Gate passed](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/QualityGateBadge/passed-16px.png 'Quality Gate passed')](https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=3531) [![Bug](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/bug-16px.png 'Bug')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3531&resolved=false&types=BUG) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3531&resolved=false&types=BUG) [0 Bugs](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3531&resolved=false&types=BUG) [![Vulnerability](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/vulnerability-16px.png 'Vulnerability')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3531&resolved=false&types=VULNERABILITY) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3531&resolved=false&types=VULNERABILITY) [0 Vulnerabilities](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3531&resolved=false&types=VULNERABILITY) [![Security Hotspot](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/security_hotspot-16px.png 'Security 
Hotspot')](https://sonarcloud.io/project/security_hotspots?id=apache_hive&pullRequest=3531&resolved=false&types=SECURITY_HOTSPOT) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/security_hotspots?id=apache_hive&pullRequest=3531&resolved=false&types=SECURITY_HOTSPOT) [0 Security Hotspots](https://sonarcloud.io/project/security_hotspots?id=apache_hive&pullRequest=3531&resolved=false&types=SECURITY_HOTSPOT) [![Code Smell](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/code_smell-16px.png 'Code Smell')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3531&resolved=false&types=CODE_SMELL) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3531&resolved=false&types=CODE_SMELL) [12 Code Smells](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3531&resolved=false&types=CODE_SMELL) [![No Coverage information](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/CoverageChart/NoCoverageInfo-16px.png 'No Coverage information')](https://sonarcloud.io/component_measures?id=apache_hive&pullRequest=3531&metric=coverage&view=list) No Coverage information [![No Duplication information](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/Duplications/NoDuplicationInfo-16px.png 'No Duplication information')](https://sonarcloud.io/component_measures?id=apache_hive&pullRequest=3531&metric=duplicated_lines_density&view=list) No Duplication information Issue Time Tracking --- Worklog Id: (was: 802689) Time Spent: 2h (was: 1h 50m) > Cleaner fails with FileNotFoundException > > > Key: HIVE-26481 > URL: https://issues.apache.org/jira/browse/HIVE-26481 > Project: Hive > Issue Type: Bug >Reporter: KIRTI RUGE >Priority: Major > Labels: pull-request-available > Time Spent: 2h > 
Remaining Estimate: 0h > > The compaction fails when the Cleaner tries to remove a missing directory > from HDFS. > {code:java} > 2022-08-05 18:56:38,873 INFO org.apache.hadoop.hive.ql.txn.compactor.Cleaner: > [Cleaner-executor-thread-0]: Starting cleaning for > id:30,dbname:default,tableName:test_concur_compaction_minor,partName:null,state:�,type:MINOR,enqueueTime:0,start:0,properties:null,runAs:hive,tooManyAborts:false,hasOldAbort:false,highestWriteId:4,errorMessage:null,workerId: > null,initiatorId: null 2022-08-05 18:56:38,888 ERROR > org.apache.hadoop.hive.ql.txn.compactor.Cleaner: [Cleaner-executor-thread-0]: > Caught exception when cleaning, unable to complete cleaning of > id:30,dbname:d
[jira] [Commented] (HIVE-24083) hcatalog error in Hadoop 3.3.0: authentication type needed
[ https://issues.apache.org/jira/browse/HIVE-24083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1758#comment-1758 ] Zhiguo Wu commented on HIVE-24083: -- [~zabetak] Hi Stamatis. I tried to reproduce this in the master branch, but it seems there is another problem we need to fix first: https://issues.apache.org/jira/browse/HIVE-26286 > hcatalog error in Hadoop 3.3.0: authentication type needed > -- > > Key: HIVE-24083 > URL: https://issues.apache.org/jira/browse/HIVE-24083 > Project: Hive > Issue Type: Bug > Components: HCatalog >Affects Versions: 3.1.2 >Reporter: Javier J. Salmeron Garcia >Priority: Minor > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > Using Hive 3.1.2, webhcat fails to start in Hadoop 3.3.0 with the following > error: > ``` > javax.servlet.ServletException: Authentication type must be specified: > simple|kerberos > ``` > I tried in Hadoop 3.2.1 with the exact same settings and it starts without issues: > > ``` > webhcat: /tmp/hadoop-3.2.1//bin/hadoop jar > /opt/bitnami/hadoop/hive/hcatalog/sbin/../share/webhcat/svr/lib/hive-webhcat-3.1.2.jar > org.apache.hive.hcatalog.templeton.Main > webhcat: starting ... started. > webhcat: done > ``` > > I can provide more logs if needed. Detected authentication settings: > > ``` > hadoop.http.authentication.simple.anonymous.allowed=true > hadoop.http.authentication.type=simple > hadoop.security.authentication=simple > ipc.client.fallback-to-simple-auth-allowed=false > yarn.timeline-service.http-authentication.simple.anonymous.allowed=true > yarn.timeline-service.http-authentication.type=simple > ``` > -- This message was sent by Atlassian Jira (v8.20.10#820010)
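The failure mode is easier to see in isolation: Hadoop's AuthenticationFilter rejects a filter registration whose init parameters carry no "type" entry, which is what happens when WebHCat registers the filter under Hadoop 3.3.x. The sketch below is illustrative only; `AuthTypeCheck` and `resolveAuthType` are invented names (the real logic lives in Hadoop's `AuthenticationFilter.init()`), and it assumes the init parameters arrive as a plain map:

```java
import java.util.Map;

// Illustrative reconstruction of the check behind "Authentication type must
// be specified". AuthTypeCheck and resolveAuthType are invented names; the
// real check is performed inside Hadoop's AuthenticationFilter.init().
class AuthTypeCheck {
    static String resolveAuthType(Map<String, String> filterInitParams) {
        // The filter looks for a "type" init parameter (simple, kerberos, or
        // a class name) and refuses to start without it.
        String authType = filterInitParams.get("type");
        if (authType == null || authType.isEmpty()) {
            throw new IllegalStateException(
                "Authentication type must be specified: simple|kerberos|<class>");
        }
        return authType;
    }
}
```

Under this sketch, a registration that passes `type=simple` (mirroring the `hadoop.http.authentication.type=simple` setting listed above) succeeds, while one with no `type` entry fails with the message seen in the log.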
[jira] [Comment Edited] (HIVE-26286) Hive WebHCat Tests are failing
[ https://issues.apache.org/jira/browse/HIVE-26286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17583330#comment-17583330 ] Zhiguo Wu edited comment on HIVE-26286 at 8/23/22 4:09 AM: --- I tracked the code; it seems to be because the class variable _*server*_ is used before it is initialized. {code:java} public Server runServer(int port) throws Exception { // ... some code omitted // !! server is used inside createChannelConnector() before it is assigned server.addConnector(createChannelConnector()); // Start the server server.start(); // !! server is only initialized here this.server = server; return server; } private Connector createChannelConnector() { ServerConnector connector; final HttpConfiguration httpConf = new HttpConfiguration(); httpConf.setRequestHeaderSize(1024 * 64); final HttpConnectionFactory http = new HttpConnectionFactory(httpConf); // !! server is still null here, so null is passed to ServerConnector, causing the NPE if (conf.getBoolean(AppConfig.USE_SSL, false)) { // ... some code omitted connector = new ServerConnector(server, sslContextFactory, http); } else { connector = new ServerConnector(server, http); } connector.setReuseAddress(true); connector.setHost(conf.get(AppConfig.HOST, DEFAULT_HOST)); connector.setPort(conf.getInt(AppConfig.PORT, DEFAULT_PORT)); return connector; }{code} was (Author: JIRAUSER294567): I tracked the code, It's seems because class variable _*server*_ is used before it is initialized. {code:java} public Server runServer(int port) throws Exception { ... ignore some code !! server will be used inside createChannelConnector() server.addConnector(createChannelConnector()); // Start the server server.start(); !! server is initialized here this.server = server; return server; } private Connector createChannelConnector() { ServerConnector connector; final HttpConfiguration httpConf = new HttpConfiguration(); httpConf.setRequestHeaderSize(1024 * 64); final HttpConnectionFactory http = new HttpConnectionFactory(httpConf); !! server will be used here, it passes null to ServerConnector, cause NPE if (conf.getBoolean(AppConfig.USE_SSL, false)) { ... ignore some codes connector = new ServerConnector(server, sslContextFactory, http); } else { connector = new ServerConnector(server, http); } connector.setReuseAddress(true); connector.setHost(conf.get(AppConfig.HOST, DEFAULT_HOST)); connector.setPort(conf.getInt(AppConfig.PORT, DEFAULT_PORT)); return connector; }{code} > Hive WebHCat Tests are failing > -- > > Key: HIVE-26286 > URL: https://issues.apache.org/jira/browse/HIVE-26286 > Project: Hive > Issue Type: Bug > Components: HCatalog >Affects Versions: 4.0.0 > Environment: [link title|http://example.com] >Reporter: Anmol Sundaram >Priority: Major > > The Hive TestWebHCatE2e tests seem to be failing due to > > {quote}templeton: Server failed to start: null > [main] ERROR org.apache.hive.hcatalog.templeton.Main - Server failed to > start: > java.lang.NullPointerException > at > org.eclipse.jetty.server.AbstractConnector.(AbstractConnector.java:174) > at > org.eclipse.jetty.server.AbstractNetworkConnector.(AbstractNetworkConnector.java:44) > at org.eclipse.jetty.server.ServerConnector.(ServerConnector.java:220) > at org.eclipse.jetty.server.ServerConnector.(ServerConnector.java:143) > at > org.apache.hive.hcatalog.templeton.Main.createChannelConnector(Main.java:295) > at org.apache.hive.hcatalog.templeton.Main.runServer(Main.java:252) > at org.apache.hive.hcatalog.templeton.Main.run(Main.java:147) > at > org.apache.hive.hcatalog.templeton.TestWebHCatE2e.startHebHcatInMem(TestWebHCatE2e.java:94) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59){quote} > This seems to be caused by HIVE-18728, which is a breaking change. -- This message was sent by Atlassian Jira (v8.20.10#820010)
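The initialization-order bug analyzed in the comment above can be reduced to a few lines. The following is a minimal sketch in plain Java, with no Jetty dependency; `FieldInitOrder` and its methods are invented names for the demo, not the actual Templeton code. It shows the same shape of bug: a helper dereferences an instance field before the enclosing method has assigned it, so the helper sees null.

```java
// Minimal sketch of the use-before-assignment bug described above.
// FieldInitOrder, runServerBuggy, etc. are illustrative names, not Hive's.
class FieldInitOrder {
    private StringBuilder server; // stands in for org.eclipse.jetty.server.Server

    // Buggy order, mirroring runServer(): the helper runs while this.server
    // is still null, which is what feeds null into
    // new ServerConnector(server, ...) and triggers the NPE.
    String runServerBuggy() {
        StringBuilder local = new StringBuilder("jetty");
        String seenByHelper = describeConnector(); // reads this.server -> "null"
        this.server = local;                       // assigned too late
        return seenByHelper;
    }

    // Fixed order: assign the field before any helper dereferences it.
    String runServerFixed() {
        this.server = new StringBuilder("jetty");
        return describeConnector();
    }

    private String describeConnector() {
        // The real createChannelConnector() passes `server` to ServerConnector;
        // here we just render the reference to show what the helper sees.
        return String.valueOf(server);
    }
}
```

Under this sketch, `runServerBuggy()` returns `"null"` while `runServerFixed()` returns `"jetty"`; the analogous fix in Main.java would be to assign `this.server` before `createChannelConnector()` is invoked.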
[jira] [Commented] (HIVE-26286) Hive WebHCat Tests are failing
[ https://issues.apache.org/jira/browse/HIVE-26286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17583330#comment-17583330 ] Zhiguo Wu commented on HIVE-26286: -- I tracked the code; it seems to be because the class variable _*server*_ is used before it is initialized. {code:java} public Server runServer(int port) throws Exception { // ... some code omitted // !! server is used inside createChannelConnector() before it is assigned server.addConnector(createChannelConnector()); // Start the server server.start(); // !! server is only initialized here this.server = server; return server; } private Connector createChannelConnector() { ServerConnector connector; final HttpConfiguration httpConf = new HttpConfiguration(); httpConf.setRequestHeaderSize(1024 * 64); final HttpConnectionFactory http = new HttpConnectionFactory(httpConf); // !! server is still null here, so null is passed to ServerConnector, causing the NPE if (conf.getBoolean(AppConfig.USE_SSL, false)) { // ... some code omitted connector = new ServerConnector(server, sslContextFactory, http); } else { connector = new ServerConnector(server, http); } connector.setReuseAddress(true); connector.setHost(conf.get(AppConfig.HOST, DEFAULT_HOST)); connector.setPort(conf.getInt(AppConfig.PORT, DEFAULT_PORT)); return connector; }{code} > Hive WebHCat Tests are failing > -- > > Key: HIVE-26286 > URL: https://issues.apache.org/jira/browse/HIVE-26286 > Project: Hive > Issue Type: Bug > Components: HCatalog >Affects Versions: 4.0.0 > Environment: [link title|http://example.com] >Reporter: Anmol Sundaram >Priority: Major > > The Hive TestWebHCatE2e tests seem to be failing due to > > {quote}templeton: Server failed to start: null > [main] ERROR org.apache.hive.hcatalog.templeton.Main - Server failed to > start: > java.lang.NullPointerException > at > org.eclipse.jetty.server.AbstractConnector.(AbstractConnector.java:174) > at > org.eclipse.jetty.server.AbstractNetworkConnector.(AbstractNetworkConnector.java:44) > at org.eclipse.jetty.server.ServerConnector.(ServerConnector.java:220) > at org.eclipse.jetty.server.ServerConnector.(ServerConnector.java:143) > at > org.apache.hive.hcatalog.templeton.Main.createChannelConnector(Main.java:295) > at org.apache.hive.hcatalog.templeton.Main.runServer(Main.java:252) > at org.apache.hive.hcatalog.templeton.Main.run(Main.java:147) > at > org.apache.hive.hcatalog.templeton.TestWebHCatE2e.startHebHcatInMem(TestWebHCatE2e.java:94) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59){quote} > This seems to be caused by HIVE-18728, which is a breaking change. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-26248) Add data connector authorization on HMS server-side
[ https://issues.apache.org/jira/browse/HIVE-26248?focusedWorklogId=802687&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-802687 ] ASF GitHub Bot logged work on HIVE-26248: - Author: ASF GitHub Bot Created on: 23/Aug/22 03:28 Start Date: 23/Aug/22 03:28 Worklog Time Spent: 10m Work Description: sonarcloud[bot] commented on PR #3539: URL: https://github.com/apache/hive/pull/3539#issuecomment-1223486606 Kudos, SonarCloud Quality Gate passed! [![Quality Gate passed](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/QualityGateBadge/passed-16px.png 'Quality Gate passed')](https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=3539) [![Bug](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/bug-16px.png 'Bug')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3539&resolved=false&types=BUG) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3539&resolved=false&types=BUG) [0 Bugs](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3539&resolved=false&types=BUG) [![Vulnerability](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/vulnerability-16px.png 'Vulnerability')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3539&resolved=false&types=VULNERABILITY) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3539&resolved=false&types=VULNERABILITY) [0 Vulnerabilities](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3539&resolved=false&types=VULNERABILITY) [![Security Hotspot](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/security_hotspot-16px.png 'Security 
Hotspot')](https://sonarcloud.io/project/security_hotspots?id=apache_hive&pullRequest=3539&resolved=false&types=SECURITY_HOTSPOT) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/security_hotspots?id=apache_hive&pullRequest=3539&resolved=false&types=SECURITY_HOTSPOT) [0 Security Hotspots](https://sonarcloud.io/project/security_hotspots?id=apache_hive&pullRequest=3539&resolved=false&types=SECURITY_HOTSPOT) [![Code Smell](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/code_smell-16px.png 'Code Smell')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3539&resolved=false&types=CODE_SMELL) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3539&resolved=false&types=CODE_SMELL) [10 Code Smells](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3539&resolved=false&types=CODE_SMELL) [![No Coverage information](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/CoverageChart/NoCoverageInfo-16px.png 'No Coverage information')](https://sonarcloud.io/component_measures?id=apache_hive&pullRequest=3539&metric=coverage&view=list) No Coverage information [![No Duplication information](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/Duplications/NoDuplicationInfo-16px.png 'No Duplication information')](https://sonarcloud.io/component_measures?id=apache_hive&pullRequest=3539&metric=duplicated_lines_density&view=list) No Duplication information Issue Time Tracking --- Worklog Id: (was: 802687) Time Spent: 3h 50m (was: 3h 40m) > Add data connector authorization on HMS server-side > --- > > Key: HIVE-26248 > URL: https://issues.apache.org/jira/browse/HIVE-26248 > Project: Hive > Issue Type: Sub-task >Affects Versions: 4.0.0-alpha-1, 4.0.0-alpha-2 >Reporter: zhangbutao 
>Assignee: zhangbutao >Priority: Major > Labels: pull-request-available > Time Spent: 3h 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-16913) Support per-session S3 credentials
[ https://issues.apache.org/jira/browse/HIVE-16913?focusedWorklogId=802685&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-802685 ] ASF GitHub Bot logged work on HIVE-16913: - Author: ASF GitHub Bot Created on: 23/Aug/22 03:07 Start Date: 23/Aug/22 03:07 Worklog Time Spent: 10m Work Description: sonarcloud[bot] commented on PR #3542: URL: https://github.com/apache/hive/pull/3542#issuecomment-1223473805 Kudos, SonarCloud Quality Gate passed! [![Quality Gate passed](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/QualityGateBadge/passed-16px.png 'Quality Gate passed')](https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=3542) [![Bug](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/bug-16px.png 'Bug')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3542&resolved=false&types=BUG) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3542&resolved=false&types=BUG) [0 Bugs](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3542&resolved=false&types=BUG) [![Vulnerability](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/vulnerability-16px.png 'Vulnerability')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3542&resolved=false&types=VULNERABILITY) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3542&resolved=false&types=VULNERABILITY) [0 Vulnerabilities](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3542&resolved=false&types=VULNERABILITY) [![Security Hotspot](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/security_hotspot-16px.png 'Security 
Hotspot')](https://sonarcloud.io/project/security_hotspots?id=apache_hive&pullRequest=3542&resolved=false&types=SECURITY_HOTSPOT) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/security_hotspots?id=apache_hive&pullRequest=3542&resolved=false&types=SECURITY_HOTSPOT) [0 Security Hotspots](https://sonarcloud.io/project/security_hotspots?id=apache_hive&pullRequest=3542&resolved=false&types=SECURITY_HOTSPOT) [![Code Smell](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/code_smell-16px.png 'Code Smell')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3542&resolved=false&types=CODE_SMELL) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3542&resolved=false&types=CODE_SMELL) [11 Code Smells](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3542&resolved=false&types=CODE_SMELL) [![No Coverage information](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/CoverageChart/NoCoverageInfo-16px.png 'No Coverage information')](https://sonarcloud.io/component_measures?id=apache_hive&pullRequest=3542&metric=coverage&view=list) No Coverage information [![No Duplication information](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/Duplications/NoDuplicationInfo-16px.png 'No Duplication information')](https://sonarcloud.io/component_measures?id=apache_hive&pullRequest=3542&metric=duplicated_lines_density&view=list) No Duplication information Issue Time Tracking --- Worklog Id: (was: 802685) Time Spent: 20m (was: 10m) > Support per-session S3 credentials > -- > > Key: HIVE-16913 > URL: https://issues.apache.org/jira/browse/HIVE-16913 > Project: Hive > Issue Type: Improvement >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Major > Labels: 
pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > Currently, the credentials needed to support Hive-on-S3 (or any other > cloud storage) need to be added to hive-site.xml, either using a Hadoop > credential provider or by adding the keys to hive-site.xml in plain text > (insecure). > This limits the use case to a single S3 key. If we configure per-bucket > S3 keys as described [here | > http://hadoop.apache.org/docs/current/hadoop-aws/tools/hadoop-aws/index.html#Configurations_different_S3_buckets] > it exposes access to all the buckets to all the hive users. > It is possible that there are different sets of users who would not like to > share their buckets an
[jira] [Work logged] (HIVE-25621) Alter table partition compact/concatenate commands should send HivePrivilegeObjects for Authz
[ https://issues.apache.org/jira/browse/HIVE-25621?focusedWorklogId=802679&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-802679 ] ASF GitHub Bot logged work on HIVE-25621: - Author: ASF GitHub Bot Created on: 23/Aug/22 02:24 Start Date: 23/Aug/22 02:24 Worklog Time Spent: 10m Work Description: dengzhhu653 commented on code in PR #2731: URL: https://github.com/apache/hive/pull/2731#discussion_r952090258 ## ql/src/test/results/clientpositive/llap/acid_insert_overwrite_update.q.out: ## @@ -243,16 +247,28 @@ POSTHOOK: Input: default@orc_test_txn@year=2018 LOOKS OKAY PREHOOK: query: alter table orc_test_txn partition(year='2016') compact 'major' PREHOOK: type: ALTERTABLE_COMPACT +PREHOOK: Input: default@orc_test_txn +PREHOOK: Output: default@orc_test_txn POSTHOOK: query: alter table orc_test_txn partition(year='2016') compact 'major' POSTHOOK: type: ALTERTABLE_COMPACT +POSTHOOK: Input: default@orc_test_txn +POSTHOOK: Output: default@orc_test_txn PREHOOK: query: alter table orc_test_txn partition(year='2017') compact 'major' PREHOOK: type: ALTERTABLE_COMPACT +PREHOOK: Input: default@orc_test_txn +PREHOOK: Output: default@orc_test_txn POSTHOOK: query: alter table orc_test_txn partition(year='2017') compact 'major' POSTHOOK: type: ALTERTABLE_COMPACT +POSTHOOK: Input: default@orc_test_txn +POSTHOOK: Output: default@orc_test_txn PREHOOK: query: alter table orc_test_txn partition(year='2018') compact 'major' PREHOOK: type: ALTERTABLE_COMPACT +PREHOOK: Input: default@orc_test_txn Review Comment: why is partition 'year=2018' not present in the Input/Output? 
Issue Time Tracking --- Worklog Id: (was: 802679) Time Spent: 2h 10m (was: 2h) > Alter table partition compact/concatenate commands should send > HivePrivilegeObjects for Authz > - > > Key: HIVE-25621 > URL: https://issues.apache.org/jira/browse/HIVE-25621 > Project: Hive > Issue Type: Bug >Affects Versions: 4.0.0 >Reporter: Sai Hemanth Gantasala >Assignee: Sai Hemanth Gantasala >Priority: Major > Labels: pull-request-available > Time Spent: 2h 10m > Remaining Estimate: 0h > > # Run the following queries: > Create table temp(c0 int) partitioned by (c1 int); > Insert into temp values(1,1); > ALTER TABLE temp PARTITION (c1=1) COMPACT 'minor'; > ALTER TABLE temp PARTITION (c1=1) CONCATENATE; > Insert into temp values(1,1); > # The above compact/concatenate commands are currently not sending any hive > privilege objects for authorization. Hive needs to send these objects to > prevent malicious users from performing unauthorized operations. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (HIVE-16913) Support per-session S3 credentials
[ https://issues.apache.org/jira/browse/HIVE-16913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17583295#comment-17583295 ] zhangbutao commented on HIVE-16913: --- [~vihangk1] I hope you don't mind that I created a PR for this ticket; please take a look if you have free time. Thx. > Support per-session S3 credentials > -- > > Key: HIVE-16913 > URL: https://issues.apache.org/jira/browse/HIVE-16913 > Project: Hive > Issue Type: Improvement >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Currently, the credentials needed to support Hive-on-S3 (or any other > cloud storage) need to be added to hive-site.xml, either using a Hadoop > credential provider or by adding the keys to hive-site.xml in plain text > (insecure). > This limits the use case to a single S3 key. If we configure per-bucket > S3 keys as described [here | > http://hadoop.apache.org/docs/current/hadoop-aws/tools/hadoop-aws/index.html#Configurations_different_S3_buckets] > it exposes access to all the buckets to all the hive users. > It is possible that there are different sets of users who would not like to > share their buckets and still be able to process the data using Hive. > Enabling session-level credentials will help solve such use cases. For > example, currently this doesn't work: > {noformat} > set fs.s3a.secret.key=my_secret_key; > set fs.s3a.access.key=my_access.key; > {noformat} > Because the metastore is unaware of the keys. This doesn't work either: > {noformat} > set fs.s3a.secret.key=my_secret_key; > set fs.s3a.access.key=my_access.key; > set metaconf:fs.s3a.secret.key=my_secret_key; > set metaconf:fs.s3a.access.key=my_access_key; > {noformat} > This is because only certain metastore configurations, defined in > {{HiveConf.MetaVars}}, are allowed to be set by the user. If we enable the > above approaches we could potentially allow multiple S3 credentials on a > per-session basis. -- This message was sent by Atlassian Jira (v8.20.10#820010)
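The rejection of `set metaconf:fs.s3a.secret.key=...` described above comes from a whitelist: only parameters enumerated in {{HiveConf.MetaVars}} may be overridden per session. The following is a hedged sketch of that gatekeeping; `MetaVarWhitelist` is an invented name and the two whitelisted keys are placeholders, not Hive's actual list.

```java
import java.util.Set;

// Illustrative sketch of the per-session override whitelist described above.
// MetaVarWhitelist is an invented name, and META_VARS holds placeholder keys;
// the real list lives in HiveConf.MetaVars.
class MetaVarWhitelist {
    private static final Set<String> META_VARS = Set.of(
        "metastore.client.socket.timeout",   // placeholder entry
        "metastore.try.direct.sql");         // placeholder entry

    // A "metaconf:" key is accepted only if the bare key is whitelisted,
    // which is why fs.s3a.* credentials are rejected at the session level.
    static boolean canSetPerSession(String key) {
        String prefix = "metaconf:";
        return key.startsWith(prefix)
            && META_VARS.contains(key.substring(prefix.length()));
    }
}
```

Under this sketch, `canSetPerSession("metaconf:fs.s3a.secret.key")` is false, which is exactly the restriction the ticket proposes to relax for S3 credentials.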
[jira] [Work logged] (HIVE-25621) Alter table partition compact/concatenate commands should send HivePrivilegeObjects for Authz
[ https://issues.apache.org/jira/browse/HIVE-25621?focusedWorklogId=802674&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-802674 ] ASF GitHub Bot logged work on HIVE-25621: - Author: ASF GitHub Bot Created on: 23/Aug/22 02:14 Start Date: 23/Aug/22 02:14 Worklog Time Spent: 10m Work Description: dengzhhu653 commented on code in PR #2731: URL: https://github.com/apache/hive/pull/2731#discussion_r952081667 ## ql/src/java/org/apache/hadoop/hive/ql/ddl/table/storage/compact/AlterTableCompactAnalyzer.java: ## @@ -27,7 +27,13 @@ import org.apache.hadoop.hive.ql.ddl.DDLWork; import org.apache.hadoop.hive.ql.ddl.DDLSemanticAnalyzerFactory.DDLType; import org.apache.hadoop.hive.ql.ddl.table.AbstractAlterTableAnalyzer; +import org.apache.hadoop.hive.ql.ddl.table.AlterTableType; import org.apache.hadoop.hive.ql.exec.TaskFactory; +import org.apache.hadoop.hive.ql.hooks.ReadEntity; Review Comment: nit: unused imports Issue Time Tracking --- Worklog Id: (was: 802674) Time Spent: 2h (was: 1h 50m) > Alter table partition compact/concatenate commands should send > HivePrivilegeObjects for Authz > - > > Key: HIVE-25621 > URL: https://issues.apache.org/jira/browse/HIVE-25621 > Project: Hive > Issue Type: Bug >Affects Versions: 4.0.0 >Reporter: Sai Hemanth Gantasala >Assignee: Sai Hemanth Gantasala >Priority: Major > Labels: pull-request-available > Time Spent: 2h > Remaining Estimate: 0h > > # Run the following queries: > Create table temp(c0 int) partitioned by (c1 int); > Insert into temp values(1,1); > ALTER TABLE temp PARTITION (c1=1) COMPACT 'minor'; > ALTER TABLE temp PARTITION (c1=1) CONCATENATE; > Insert into temp values(1,1); > # The above compact/concatenate commands are currently not sending any hive > privilege objects for authorization. Hive needs to send these objects to > prevent malicious users from performing unauthorized operations. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-25621) Alter table partition compact/concatenate commands should send HivePrivilegeObjects for Authz
[ https://issues.apache.org/jira/browse/HIVE-25621?focusedWorklogId=802673&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-802673 ] ASF GitHub Bot logged work on HIVE-25621: - Author: ASF GitHub Bot Created on: 23/Aug/22 02:13 Start Date: 23/Aug/22 02:13 Worklog Time Spent: 10m Work Description: dengzhhu653 commented on code in PR #2731: URL: https://github.com/apache/hive/pull/2731#discussion_r952081352 ## ql/src/java/org/apache/hadoop/hive/ql/ddl/table/storage/compact/AlterTableCompactAnalyzer.java: ## @@ -67,6 +73,18 @@ protected void analyzeCommand(TableName tableName, Map partition } AlterTableCompactDesc desc = new AlterTableCompactDesc(tableName, partitionSpec, type, isBlocking, mapProp); +//Table table = getTable(tableName); Review Comment: nit: can we remove these comments as well? Issue Time Tracking --- Worklog Id: (was: 802673) Time Spent: 1h 50m (was: 1h 40m) > Alter table partition compact/concatenate commands should send > HivePrivilegeObjects for Authz > - > > Key: HIVE-25621 > URL: https://issues.apache.org/jira/browse/HIVE-25621 > Project: Hive > Issue Type: Bug >Affects Versions: 4.0.0 >Reporter: Sai Hemanth Gantasala >Assignee: Sai Hemanth Gantasala >Priority: Major > Labels: pull-request-available > Time Spent: 1h 50m > Remaining Estimate: 0h > > # Run the following queries: > Create table temp(c0 int) partitioned by (c1 int); > Insert into temp values(1,1); > ALTER TABLE temp PARTITION (c1=1) COMPACT 'minor'; > ALTER TABLE temp PARTITION (c1=1) CONCATENATE; > Insert into temp values(1,1); > # The above compact/concatenate commands are currently not sending any hive > privilege objects for authorization. Hive needs to send these objects to > prevent malicious users from performing unauthorized operations. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-16913) Support per-session S3 credentials
[ https://issues.apache.org/jira/browse/HIVE-16913?focusedWorklogId=802672&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-802672 ] ASF GitHub Bot logged work on HIVE-16913: - Author: ASF GitHub Bot Created on: 23/Aug/22 02:10 Start Date: 23/Aug/22 02:10 Worklog Time Spent: 10m Work Description: zhangbutao opened a new pull request, #3542: URL: https://github.com/apache/hive/pull/3542 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? Issue Time Tracking --- Worklog Id: (was: 802672) Remaining Estimate: 0h Time Spent: 10m > Support per-session S3 credentials > -- > > Key: HIVE-16913 > URL: https://issues.apache.org/jira/browse/HIVE-16913 > Project: Hive > Issue Type: Improvement >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > Currently, the credentials needed to support Hive-on-S3 (or any other > cloud-storage) need to be to the hive-site.xml. Either using a hadoop > credential provider or by adding the keys in the hive-site.xml in plain text > (unsecure) > This limits the usecase to using a single S3 key. If we configure per bucket > s3 keys like described [here | > http://hadoop.apache.org/docs/current/hadoop-aws/tools/hadoop-aws/index.html#Configurations_different_S3_buckets] > it exposes the access to all the buckets to all the hive users. > It is possible that there are different sets of users who would not like to > share there buckets and still be able to process the data using Hive. > Enabling session level credentials will help solve such use-cases. For > example, currently this doesn't work > {noformat} > set fs.s3a.secret.key=my_secret_key; > set fs.s3a.access.key=my_access.key; > {noformat} > Because metastore is unaware of the the keys. 
This doesn't work either: > {noformat} > set fs.s3a.secret.key=my_secret_key; > set fs.s3a.access.key=my_access.key; > set metaconf:fs.s3a.secret.key=my_secret_key; > set metaconf:fs.s3a.access.key=my_access_key; > {noformat} > This is because only certain metastore configurations, defined in > {{HiveConf.MetaVars}}, are allowed to be set by the user. If we enable the > above approaches we could potentially allow multiple S3 credentials on a > per-session basis. -- This message was sent by Atlassian Jira (v8.20.10#820010)
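The restriction described above — a `set metaconf:...` only succeeds for keys listed in HiveConf.MetaVars — can be sketched as follows. This is a minimal stand-in for illustration; the entries in META_VARS below are hypothetical and the real HiveConf.MetaVars list differs.

```java
import java.util.Arrays;
import java.util.List;

// Minimal sketch of the MetaVars whitelist check described above.
// META_VARS is a hypothetical stand-in, not the real HiveConf list.
public class MetaVarsCheckSketch {
    static final List<String> META_VARS = Arrays.asList(
            "metastore.client.socket.timeout",
            "metastore.batch.retrieve.max");

    // A "set metaconf:key=value" is only honored if the key is whitelisted.
    public static boolean isAllowedMetaConf(String key) {
        return META_VARS.contains(key);
    }

    public static void main(String[] args) {
        // The S3 keys from the example are not in MetaVars,
        // so the per-session override is rejected.
        System.out.println(isAllowedMetaConf("fs.s3a.secret.key"));
    }
}
```

Allowing per-session S3 credentials would mean either widening this whitelist or adding a dedicated propagation path for filesystem credentials.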
[jira] [Updated] (HIVE-16913) Support per-session S3 credentials
[ https://issues.apache.org/jira/browse/HIVE-16913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-16913: -- Labels: pull-request-available (was: ) > Support per-session S3 credentials > -- > > Key: HIVE-16913 > URL: https://issues.apache.org/jira/browse/HIVE-16913 > Project: Hive > Issue Type: Improvement >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Currently, the credentials needed to support Hive-on-S3 (or any other > cloud storage) need to be added to hive-site.xml, either using a Hadoop > credential provider or by adding the keys in hive-site.xml in plain text > (insecure). > This limits the use case to a single S3 key. If we configure per-bucket > S3 keys as described [here | > http://hadoop.apache.org/docs/current/hadoop-aws/tools/hadoop-aws/index.html#Configurations_different_S3_buckets] > it exposes access to all the buckets to all the Hive users. > It is possible that there are different sets of users who would not like to > share their buckets and would still like to process the data using Hive. > Enabling session-level credentials will help solve such use cases. For > example, currently this doesn't work: > {noformat} > set fs.s3a.secret.key=my_secret_key; > set fs.s3a.access.key=my_access.key; > {noformat} > because the metastore is unaware of the keys. This doesn't work either: > {noformat} > set fs.s3a.secret.key=my_secret_key; > set fs.s3a.access.key=my_access.key; > set metaconf:fs.s3a.secret.key=my_secret_key; > set metaconf:fs.s3a.access.key=my_access_key; > {noformat} > This is because only certain metastore configurations, defined in > {{HiveConf.MetaVars}}, are allowed to be set by the user. If we enable the > above approaches we could potentially allow multiple S3 credentials on a > per-session basis. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-26353) Http auth header for NEGOTIATE is not standard
[ https://issues.apache.org/jira/browse/HIVE-26353?focusedWorklogId=802660&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-802660 ] ASF GitHub Bot logged work on HIVE-26353: - Author: ASF GitHub Bot Created on: 23/Aug/22 00:27 Start Date: 23/Aug/22 00:27 Worklog Time Spent: 10m Work Description: github-actions[bot] commented on PR #3403: URL: https://github.com/apache/hive/pull/3403#issuecomment-1223364705 This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Feel free to reach out on the d...@hive.apache.org list if the patch is in need of reviews. Issue Time Tracking --- Worklog Id: (was: 802660) Time Spent: 20m (was: 10m) > Http auth header for NEGOTIATE is not standard > -- > > Key: HIVE-26353 > URL: https://issues.apache.org/jira/browse/HIVE-26353 > Project: Hive > Issue Type: Bug >Reporter: feiwang >Priority: Minor > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > The auth header for http spnego is not standard. > The code link: > https://github.com/apache/hive/blob/7b3ecf617a6d46f48a3b6f77e0339fd4ad95a420/jdbc/src/java/org/apache/hive/jdbc/HttpKerberosRequestInterceptor.java#L58-L65 > {code:java} > @Override > protected void addHttpAuthHeader(HttpRequest httpRequest, HttpContext > httpContext) throws Exception { > try { > // Generate the service ticket for sending to the server. 
> // Locking ensures the tokens are unique in case of concurrent requests > kerberosLock.lock(); > String kerberosAuthHeader = > HttpAuthUtils.getKerberosServiceTicket(principal, host, serverHttpUrl, > loggedInSubject); > // Set the session key token (Base64 encoded) in the headers > httpRequest.addHeader(HttpAuthUtils.AUTHORIZATION + ": " + > HttpAuthUtils.NEGOTIATE + " ", kerberosAuthHeader); > } catch (Exception e) { > throw new HttpException(e.getMessage(), e); > } finally { > kerberosLock.unlock(); > } > } > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
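The non-standard part of the snippet above is that the header name passed to addHeader already contains ": Negotiate ", so the produced header line does not follow the usual "Authorization: Negotiate <token>" shape. A rough illustration of the difference (buildStandardAuthValue is a hypothetical helper, not Hive code):

```java
// Sketch contrasting the header construction quoted above with the
// standard RFC 7235 form: name "Authorization", value "Negotiate <token>".
public class NegotiateHeaderSketch {
    public static String buildStandardAuthValue(String kerberosTicket) {
        return "Negotiate " + kerberosTicket;
    }

    public static void main(String[] args) {
        String ticket = "YIIabc...";  // placeholder for the Base64 service ticket
        // Non-standard (as in the snippet): the header NAME becomes
        // "Authorization: Negotiate " and the raw ticket is the value.
        String nonStandardName = "Authorization" + ": " + "Negotiate" + " ";
        System.out.println("name=[" + nonStandardName + "] value=[" + ticket + "]");
        // Standard form: scheme and token live in the header VALUE.
        System.out.println("Authorization: " + buildStandardAuthValue(ticket));
    }
}
```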
[jira] [Work logged] (HIVE-24776) Reduce HMS DB calls during stats updates
[ https://issues.apache.org/jira/browse/HIVE-24776?focusedWorklogId=802661&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-802661 ] ASF GitHub Bot logged work on HIVE-24776: - Author: ASF GitHub Bot Created on: 23/Aug/22 00:27 Start Date: 23/Aug/22 00:27 Worklog Time Spent: 10m Work Description: github-actions[bot] commented on PR #3400: URL: https://github.com/apache/hive/pull/3400#issuecomment-1223364726 This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Feel free to reach out on the d...@hive.apache.org list if the patch is in need of reviews. Issue Time Tracking --- Worklog Id: (was: 802661) Time Spent: 1h 40m (was: 1.5h) > Reduce HMS DB calls during stats updates > > > Key: HIVE-24776 > URL: https://issues.apache.org/jira/browse/HIVE-24776 > Project: Hive > Issue Type: Improvement >Reporter: Rajesh Balamohan >Assignee: Harshit Gupta >Priority: Major > Labels: pull-request-available > Time Spent: 1h 40m > Remaining Estimate: 0h > > When adding a large number of partitions (100s/1000s) to a table, it ends up > making lots of getTable calls which are not needed. > Lines mentioned below may vary slightly in apache-master.
> {noformat} > at > org.datanucleus.api.jdo.JDOPersistenceManager.jdoRetrieve(JDOPersistenceManager.java:620) > at > org.datanucleus.api.jdo.JDOPersistenceManager.retrieve(JDOPersistenceManager.java:637) > at > org.datanucleus.api.jdo.JDOPersistenceManager.retrieve(JDOPersistenceManager.java:646) > at > org.apache.hadoop.hive.metastore.ObjectStore.getMTable(ObjectStore.java:2112) > at > org.apache.hadoop.hive.metastore.ObjectStore.getMTable(ObjectStore.java:2150) > at > org.apache.hadoop.hive.metastore.ObjectStore.ensureGetMTable(ObjectStore.java:4578) > at > org.apache.hadoop.hive.metastore.ObjectStore.ensureGetTable(ObjectStore.java:4588) > at > org.apache.hadoop.hive.metastore.ObjectStore.updatePartitionColumnStatistics(ObjectStore.java:9264) > at sun.reflect.GeneratedMethodAccessor92.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97) > at com.sun.proxy.$Proxy27.updatePartitionColumnStatistics(Unknown > Source) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.updatePartitonColStatsInternal(HiveMetaStore.java:6679) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.updatePartColumnStatsWithMerge(HiveMetaStore.java:8655) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.set_aggr_stats_for(HiveMetaStore.java:8592) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108) > at 
com.sun.proxy.$Proxy28.set_aggr_stats_for(Unknown Source) > at > org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$set_aggr_stats_for.getResult(ThriftHiveMetastore.java:19060) > at > org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$set_aggr_stats_for.getResult(ThriftHiveMetastore.java:19044) > at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) > at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) > {noformat} -- This message was sent by Atlassian Jira (v8.20.10#820010)
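The stack trace shows ensureGetTable (and so a JDO retrieve) being reached once per partition stat update. The kind of reuse the issue asks for can be sketched as memoizing the table lookup for the duration of one batch; all names here are hypothetical stand-ins, not the actual ObjectStore API:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: memoize the table lookup so N partition stat updates trigger
// one metastore DB round trip instead of N. Names are hypothetical.
public class TableLookupCacheSketch {
    static int dbCalls = 0;

    static String getTableFromDb(String qualifiedName) {
        dbCalls++;  // stands in for the JDO retrieve() in the trace above
        return "MTable(" + qualifiedName + ")";
    }

    public static int updatePartitionStats(int partitions) {
        Map<String, String> cache = new HashMap<>();
        for (int i = 0; i < partitions; i++) {
            cache.computeIfAbsent("db.tbl", TableLookupCacheSketch::getTableFromDb);
            // ... update column stats for partition i using the cached table ...
        }
        return dbCalls;
    }

    public static void main(String[] args) {
        System.out.println(updatePartitionStats(1000)); // prints 1
    }
}
```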
[jira] [Work logged] (HIVE-26314) Support alter function in Hive DDL
[ https://issues.apache.org/jira/browse/HIVE-26314?focusedWorklogId=802662&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-802662 ] ASF GitHub Bot logged work on HIVE-26314: - Author: ASF GitHub Bot Created on: 23/Aug/22 00:27 Start Date: 23/Aug/22 00:27 Worklog Time Spent: 10m Work Description: github-actions[bot] commented on PR #3360: URL: https://github.com/apache/hive/pull/3360#issuecomment-1223364773 This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Feel free to reach out on the d...@hive.apache.org list if the patch is in need of reviews. Issue Time Tracking --- Worklog Id: (was: 802662) Time Spent: 1.5h (was: 1h 20m) > Support alter function in Hive DDL > -- > > Key: HIVE-26314 > URL: https://issues.apache.org/jira/browse/HIVE-26314 > Project: Hive > Issue Type: Task > Components: Hive >Affects Versions: 4.0.0-alpha-1 >Reporter: Wechar >Assignee: Wechar >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0-alpha-2 > > Time Spent: 1.5h > Remaining Estimate: 0h > > Hive SQL does not support {{*ALTER FUNCTION*}} yet, we can refer to the > {{*CREATE [OR REPLACE] FUNCTION*}} of > [Spark|https://spark.apache.org/docs/3.1.2/sql-ref-syntax-ddl-create-function.html] > to implement the alter function . > {code:sql} > CREATE [ TEMPORARY ] FUNCTION [ OR REPLACE ] [IF NOT EXISTS ] > [db_name.]function_name AS class_name > [USING JAR|FILE|ARCHIVE 'file_uri' [, JAR|FILE|ARCHIVE 'file_uri'] ]; > {code} > * *OR REPLACE* > If specified, the resources for the function are reloaded. This is mainly > useful to pick up any changes made to the implementation of the function. > This parameter is mutually exclusive to {{*IF NOT EXISTS*}} and can not be > specified together. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-26472) Concurrent UPDATEs can cause duplicate rows
[ https://issues.apache.org/jira/browse/HIVE-26472?focusedWorklogId=802650&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-802650 ] ASF GitHub Bot logged work on HIVE-26472: - Author: ASF GitHub Bot Created on: 22/Aug/22 23:18 Start Date: 22/Aug/22 23:18 Worklog Time Spent: 10m Work Description: sonarcloud[bot] commented on PR #3524: URL: https://github.com/apache/hive/pull/3524#issuecomment-1223312549 Kudos, SonarCloud Quality Gate passed! 0 Bugs, 0 Vulnerabilities, 0 Security Hotspots, 16 Code Smells; no coverage or duplication information. Issue Time Tracking --- Worklog Id: (was: 802650) Time Spent: 1h 50m (was: 1h 40m) > Concurrent UPDATEs can cause duplicate rows > --- > > Key: HIVE-26472 > URL: https://issues.apache.org/jira/browse/HIVE-26472 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 4.0.0-alpha-1 >Reporter: John Sherman
>Assignee: John Sherman >Priority: Critical > Labels: pull-request-available > Attachments: debug.diff > > Time Spent: 1h 50m > Remaining Estimate: 0h > > Concurrent UPDATEs to the same table can cause duplicate rows when the > following occurs: > Two UPDATEs get assigned txnIds and writeIds like this: > UPDATE #1 = txnId: 100 writeId: 50 <--- commits first > UPDATE #2 = txnId: 101 writeId: 49 > To replicate the issue: > I applied the attached debug.diff patch which adds hive.lock.sleep.writeid > (which controls the amount to sleep before acquiring a writeId) and > hive.lock.sleep.post.writeid (which controls the amount to sleep after > acquiring a writeId). > {code
[jira] [Work logged] (HIVE-26464) New credential provider for replicating to the cloud
[ https://issues.apache.org/jira/browse/HIVE-26464?focusedWorklogId=802590&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-802590 ] ASF GitHub Bot logged work on HIVE-26464: - Author: ASF GitHub Bot Created on: 22/Aug/22 18:42 Start Date: 22/Aug/22 18:42 Worklog Time Spent: 10m Work Description: sonarcloud[bot] commented on PR #3526: URL: https://github.com/apache/hive/pull/3526#issuecomment-1222770778 Kudos, SonarCloud Quality Gate passed! 0 Bugs, 0 Vulnerabilities, 2 Security Hotspots (rated E), 11 Code Smells; no coverage or duplication information. Issue Time Tracking --- Worklog Id: (was: 802590) Time Spent: 40m (was: 0.5h) > New credential provider for replicating to the cloud > > > Key: HIVE-26464 > URL: https://issues.apache.org/jira/browse/HIVE-26464 > Project: Hive > Issue Type: Task > Components: HiveServer2, repl >Reporter: Peter Felker >Assignee: Peter Felker
>Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > In {{ReplDumpTask}}, if the following *new* config is provided in > {{HiveConf}}: > * {{hive.repl.cloud.credential.provider.path}} > then the HS2 credstore URI scheme, contained by {{HiveConf}} with key > {{hadoop.security.credential.provider.path}}, should be updated so that it > will start with new scheme: {{hiverepljceks}}. For instance: > {code}jceks://file/path/to/credstore/creds.localjceks{code} > will become: > {code}hiverepljceks://file/path/to/credstore/creds.localjceks{code} > This new scheme, {{hiverepljceks}}, will make Hadoop to use a *new* > credential pro
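The scheme rewrite described in the issue — a jceks:// credstore URI becoming hiverepljceks:// — can be sketched with a small helper (hypothetical name; the real ReplDumpTask logic may differ):

```java
// Sketch of the credstore URI scheme rewrite described above:
// "jceks://..." becomes "hiverepljceks://...". Hypothetical helper name.
public class ReplSchemeSketch {
    public static String toReplProviderPath(String providerPath) {
        if (providerPath.startsWith("jceks://")) {
            return "hiverepl" + providerPath;  // jceks:// -> hiverepljceks://
        }
        return providerPath;  // leave non-jceks paths untouched
    }

    public static void main(String[] args) {
        System.out.println(
            toReplProviderPath("jceks://file/path/to/credstore/creds.localjceks"));
    }
}
```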
[jira] [Work logged] (HIVE-25621) Alter table partition compact/concatenate commands should send HivePrivilegeObjects for Authz
[ https://issues.apache.org/jira/browse/HIVE-25621?focusedWorklogId=802540&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-802540 ] ASF GitHub Bot logged work on HIVE-25621: - Author: ASF GitHub Bot Created on: 22/Aug/22 16:24 Start Date: 22/Aug/22 16:24 Worklog Time Spent: 10m Work Description: saihemanth-cloudera commented on PR #2731: URL: https://github.com/apache/hive/pull/2731#issuecomment-1222596505 @dengzhhu653 - Can you please review my patch again? Thanks Issue Time Tracking --- Worklog Id: (was: 802540) Time Spent: 1h 40m (was: 1.5h) > Alter table partition compact/concatenate commands should send > HivePrivilegeObjects for Authz > - > > Key: HIVE-25621 > URL: https://issues.apache.org/jira/browse/HIVE-25621 > Project: Hive > Issue Type: Bug >Affects Versions: 4.0.0 >Reporter: Sai Hemanth Gantasala >Assignee: Sai Hemanth Gantasala >Priority: Major > Labels: pull-request-available > Time Spent: 1h 40m > Remaining Estimate: 0h > > # Run the following queries > Create table temp(c0 int) partitioned by (c1 int); > Insert into temp values(1,1); > ALTER TABLE temp PARTITION (c1=1) COMPACT 'minor'; > ALTER TABLE temp PARTITION (c1=1) CONCATENATE; > Insert into temp values(1,1); > # The above compact/concatenate commands are currently not sending any hive > privilege objects for authorization. Hive needs to send these objects to > prevent malicious users from performing unauthorized operations. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-24933) Replication fails for transactional tables having same name as dropped non-transactional table
[ https://issues.apache.org/jira/browse/HIVE-24933?focusedWorklogId=802534&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-802534 ] ASF GitHub Bot logged work on HIVE-24933: - Author: ASF GitHub Bot Created on: 22/Aug/22 16:13 Start Date: 22/Aug/22 16:13 Worklog Time Spent: 10m Work Description: cmunkey commented on code in PR #3435: URL: https://github.com/apache/hive/pull/3435#discussion_r951627733 ## ql/src/java/org/apache/hadoop/hive/ql/plan/DeferredWorkHelper.java: ## @@ -0,0 +1,65 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.hadoop.hive.ql.plan; + +import org.apache.hadoop.fs.Path; +import org.apache.hadoop.hive.metastore.ReplChangeManager; +import org.apache.hadoop.hive.metastore.utils.MetaStoreUtils; +import org.apache.hadoop.hive.ql.Context; +import org.apache.hadoop.hive.ql.exec.Utilities; +import org.apache.hadoop.hive.ql.io.AcidUtils; +import org.apache.hadoop.hive.ql.metadata.Hive; +import org.apache.hadoop.hive.ql.metadata.HiveException; +import org.apache.hadoop.hive.ql.metadata.Table; +import org.apache.hadoop.hive.ql.parse.ImportSemanticAnalyzer; + +import java.util.Collections; +import java.util.TreeMap; + +public class DeferredWorkHelper { Review Comment: Call this DeferredContext since it does not do anything anymore. The deferred context is used to make decisions at deferred time. Issue Time Tracking --- Worklog Id: (was: 802534) Time Spent: 4h 10m (was: 4h) > Replication fails for transactional tables having same name as dropped > non-transactional table > -- > > Key: HIVE-24933 > URL: https://issues.apache.org/jira/browse/HIVE-24933 > Project: Hive > Issue Type: Bug >Reporter: Pratyush Madhukar >Assignee: Pratyush Madhukar >Priority: Major > Labels: pull-request-available > Time Spent: 4h 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-26482) Create a unit test checking compaction output file names on a partitioned table
[ https://issues.apache.org/jira/browse/HIVE-26482?focusedWorklogId=802533&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-802533 ] ASF GitHub Bot logged work on HIVE-26482: - Author: ASF GitHub Bot Created on: 22/Aug/22 16:10 Start Date: 22/Aug/22 16:10 Worklog Time Spent: 10m Work Description: veghlaci05 commented on code in PR #3532: URL: https://github.com/apache/hive/pull/3532#discussion_r951624999 ## ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java: ## @@ -3264,6 +3264,37 @@ public void testNoTxnComponentsForScheduledQueries() throws Exception { Assert.assertEquals(resData, stringifyValues(actualData)); } + @Test + public void testCompactionOutputDirectoryNamesOnPartitions() throws Exception { +String p1 = "p=p1"; +String p2 = "p=p2"; +String expectedDelta1 = p1 + "/delta_001_002_v021"; +String expectedDelta2 = p2 + "/delta_003_004_v022"; + +runStatementOnDriver("insert into " + Table.ACIDTBLPART + " partition(p='p1') (a,b) values(1,2)"); +runStatementOnDriver("insert into " + Table.ACIDTBLPART + " partition(p='p1') (a,b) values(3,4)"); +runStatementOnDriver("insert into " + Table.ACIDTBLPART + " partition(p='p2') (a,b) values(1,2)"); +runStatementOnDriver("insert into " + Table.ACIDTBLPART + " partition(p='p2') (a,b) values(3,4)"); + +compactPartition(Table.ACIDTBLPART.name().toLowerCase(), CompactionType.MINOR, p1); +compactPartition(Table.ACIDTBLPART.name().toLowerCase(), CompactionType.MINOR, p2); + +FileSystem fs = FileSystem.get(hiveConf); +String tablePath = getWarehouseDir() + "/" + Table.ACIDTBLPART.name().toLowerCase() + "/"; + +Assert.assertTrue(fs.exists(new Path(tablePath + expectedDelta1))); +Assert.assertTrue(fs.exists(new Path(tablePath + expectedDelta2))); Review Comment: Maybe adding an extra assertion, that there are no other folders on the filesystem? 
Issue Time Tracking --- Worklog Id: (was: 802533) Time Spent: 40m (was: 0.5h) > Create a unit test checking compaction output file names on a partitioned > table > --- > > Key: HIVE-26482 > URL: https://issues.apache.org/jira/browse/HIVE-26482 > Project: Hive > Issue Type: Test > Components: Hive >Reporter: Zsolt Miskolczi >Assignee: Zsolt Miskolczi >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > Compaction output directories' writeIds only reflect the writeIds of the > deltas they compact, and not the max write id of the table. > Example: > Pre-compaction... > {code:java} > Partition p=1 contains: > delta_1_1 > delta_2_2 > partition p=2 contains > delta_3_3 > delta_4_4 > {code} > After minor compaction... > {code:java} > Partition p=1 contains: > delta_1_2 > partition p=2 contains > delta_3_4 > {code} > AFAIK there are no unit tests that reflect this. > TestTxnCommands2#testFullACIDAbortWithManyPartitions is a good template to > start with. -- This message was sent by Atlassian Jira (v8.20.10#820010)
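The pre/post-compaction example in the description amounts to collapsing each partition's delta write-id range into one directory name. A sketch (mergeDeltas is a hypothetical helper, not Hive's AcidUtils naming code):

```java
// Sketch of minor-compaction delta naming from the example above:
// per partition, delta_min_.. through .._max collapse to delta_min_max.
// mergeDeltas is a hypothetical helper for illustration only.
public class DeltaNameSketch {
    public static String mergeDeltas(int minWriteId, int maxWriteId) {
        return "delta_" + minWriteId + "_" + maxWriteId;
    }

    public static void main(String[] args) {
        // Partition p=1: delta_1_1 + delta_2_2 -> delta_1_2
        System.out.println(mergeDeltas(1, 2));
        // Partition p=2: delta_3_3 + delta_4_4 -> delta_3_4
        System.out.println(mergeDeltas(3, 4));
    }
}
```

Note the merged range stops at each partition's own max write id (2 and 4), not at the table-wide max, which is exactly what the proposed unit test asserts.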
[jira] [Work logged] (HIVE-26472) Concurrent UPDATEs can cause duplicate rows
[ https://issues.apache.org/jira/browse/HIVE-26472?focusedWorklogId=802521&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-802521 ] ASF GitHub Bot logged work on HIVE-26472: - Author: ASF GitHub Bot Created on: 22/Aug/22 15:46 Start Date: 22/Aug/22 15:46 Worklog Time Spent: 10m Work Description: sonarcloud[bot] commented on PR #3524: URL: https://github.com/apache/hive/pull/3524#issuecomment-1222541668 Kudos, SonarCloud Quality Gate passed! 0 Bugs, 0 Vulnerabilities, 0 Security Hotspots, 16 Code Smells; no coverage or duplication information. Issue Time Tracking --- Worklog Id: (was: 802521) Time Spent: 1h 40m (was: 1.5h) > Concurrent UPDATEs can cause duplicate rows > --- > > Key: HIVE-26472 > URL: https://issues.apache.org/jira/browse/HIVE-26472 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 4.0.0-alpha-1 >Reporter: John Sherman
>Assignee: John Sherman >Priority: Critical > Labels: pull-request-available > Attachments: debug.diff > > Time Spent: 1h 40m > Remaining Estimate: 0h > > Concurrent UPDATEs to the same table can cause duplicate rows when the > following occurs: > Two UPDATEs get assigned txnIds and writeIds like this: > UPDATE #1 = txnId: 100 writeId: 50 <--- commits first > UPDATE #2 = txnId: 101 writeId: 49 > To replicate the issue: > I applied the attached debug.diff patch which adds hive.lock.sleep.writeid > (which controls the amount to sleep before acquiring a writeId) and > hive.lock.sleep.post.writeid (which controls the amount to sleep after > acquiring a writeId). > {code:j
[jira] [Work logged] (HIVE-26443) Add priority queueing to compaction
[ https://issues.apache.org/jira/browse/HIVE-26443?focusedWorklogId=802494&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-802494 ] ASF GitHub Bot logged work on HIVE-26443: - Author: ASF GitHub Bot Created on: 22/Aug/22 14:47 Start Date: 22/Aug/22 14:47 Worklog Time Spent: 10m Work Description: veghlaci05 commented on code in PR #3513: URL: https://github.com/apache/hive/pull/3513#discussion_r951533741 ## ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java: ## @@ -171,6 +176,7 @@ public void run() { for (CompactionInfo ci : potentials) { try { + Database d = resolveDatabaseAndCache(ci); Review Comment: Only the pool can be set on DB level. The code is not looking for any other properties, like `no_auto_compaction`. The DB level pool assignment was a specific requirement from Janos, to make it easier to assign a whole DB to some dedicated pool. Issue Time Tracking --- Worklog Id: (was: 802494) Time Spent: 6h 10m (was: 6h) > Add priority queueing to compaction > --- > > Key: HIVE-26443 > URL: https://issues.apache.org/jira/browse/HIVE-26443 > Project: Hive > Issue Type: New Feature >Reporter: László Végh >Assignee: László Végh >Priority: Major > Labels: pull-request-available > Attachments: Pool based compaction queues.docx > > Time Spent: 6h 10m > Remaining Estimate: 0h > > The details can be found in the attached design doc. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-26456) Remove stringifyException Method From Storage Handlers
[ https://issues.apache.org/jira/browse/HIVE-26456?focusedWorklogId=802489&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-802489 ] ASF GitHub Bot logged work on HIVE-26456: - Author: ASF GitHub Bot Created on: 22/Aug/22 14:32 Start Date: 22/Aug/22 14:32 Worklog Time Spent: 10m Work Description: zabetak commented on code in PR #3506: URL: https://github.com/apache/hive/pull/3506#discussion_r951498152 ## accumulo-handler/src/test/org/apache/hadoop/hive/accumulo/predicate/TestAccumuloPredicateHandler.java: ## @@ -386,7 +386,7 @@ public void testPushdownComparisonOptNotSupported() { } catch (RuntimeException e) { assertTrue(e.getMessage().contains("Unexpected residual predicate: field1 is not null")); } catch (Exception e) { - fail(StringUtils.stringifyException(e)); + fail(e.getMessage()); } Review Comment: Remove unused `import org.apache.hadoop.util.StringUtils;` ## accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/AccumuloStorageHandler.java: ## @@ -369,11 +369,7 @@ public void preCreateTable(Table table) throws MetaException { tableOpts.create(idxTable); } } -} catch (AccumuloSecurityException e) { - throw new MetaException(StringUtils.stringifyException(e)); -} catch (TableExistsException e) { - throw new MetaException(StringUtils.stringifyException(e)); -} catch (AccumuloException e) { +} catch (AccumuloSecurityException | TableExistsException | AccumuloException e) { Review Comment: In a previous PR (https://github.com/apache/hive/pull/3478) we opted to use the LOG + throw pattern as an alternative of removing `stringifyException` acknowledging advantages and disadvantages of it. Why we don't want to do the same here? 
## accumulo-handler/src/test/org/apache/hadoop/hive/accumulo/predicate/TestAccumuloPredicateHandler.java: ## @@ -386,7 +386,7 @@ public void testPushdownComparisonOptNotSupported() { } catch (RuntimeException e) { assertTrue(e.getMessage().contains("Unexpected residual predicate: field1 is not null")); } catch (Exception e) { - fail(StringUtils.stringifyException(e)); + fail(e.getMessage()); } Review Comment: It seems we don't really need this catch. ## accumulo-handler/src/test/org/apache/hadoop/hive/accumulo/predicate/TestAccumuloPredicateHandler.java: ## @@ -475,7 +474,7 @@ public void testIgnoreIteratorPushdown() throws TooManyAccumuloColumnsException List iterators = handler.getIterators(conf, columnMapper); assertEquals(iterators.size(), 0); } catch (Exception e) { - fail(StringUtils.stringifyException(e)); + fail(e.getMessage()); } Review Comment: Same comment as before. ## druid-handler/src/test/org/apache/hadoop/hive/druid/QTestDruidSerDe.java: ## @@ -96,7 +96,7 @@ public class QTestDruidSerDe extends DruidSerDe { DruidStorageHandlerUtils.JSON_MAPPER.readValue(RESPONSE, new TypeReference>() { }); } catch (Exception e) { - throw new SerDeException(StringUtils.stringifyException(e)); + throw new SerDeException(e); Review Comment: Remove unused `import org.apache.hadoop.util.StringUtils;` ## accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/mr/HiveAccumuloTableInputFormat.java: ## @@ -175,15 +175,8 @@ public InputSplit[] getSplits(JobConf jobConf, int numSplits) throws IOException } return hiveSplits; -} catch (AccumuloException e) { - log.error("Could not configure AccumuloInputFormat", e); - throw new IOException(StringUtils.stringifyException(e)); -} catch (AccumuloSecurityException e) { - log.error("Could not configure AccumuloInputFormat", e); - throw new IOException(StringUtils.stringifyException(e)); -} catch (SerDeException e) { - log.error("Could not configure AccumuloInputFormat", e); - throw new 
IOException(StringUtils.stringifyException(e)); +} catch (AccumuloException | AccumuloSecurityException | SerDeException e) { + throw new IOException("Could not configure AccumuloInputFormat", e); Review Comment: Remove unused `import org.apache.hadoop.util.StringUtils;` ## accumulo-handler/src/test/org/apache/hadoop/hive/accumulo/predicate/TestAccumuloPredicateHandler.java: ## @@ -423,7 +423,6 @@ public void testIteratorIgnoreRowIDFields() { List iterators = handler.getIterators(conf, columnMapper); assertEquals(iterators.size(), 0); } catch (SerDeException e) { - StringUtils.stringifyException(e); } Review Comment: Ignoring the exception seems wrong. Most likely the test should fail in case of exception. Can we fix this as part of this PR? Issue Time Tracking ---
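The review comments above contrast `StringUtils.stringifyException` (which flattens the stack trace into the new exception's message) with passing the original exception as a cause. A minimal sketch of the multi-catch-plus-cause pattern the diff adopts; `lowLevel` and `translate` are made-up names for illustration, not Hive code:

```java
import java.io.IOException;

// One multi-catch arm plus exception chaining, instead of stringifying
// the stack trace into the wrapper's message.
public class MultiCatchSketch {

    static void lowLevel(boolean security) throws Exception {
        if (security) throw new IllegalAccessException("no permission");
        throw new IllegalStateException("table exists");
    }

    static void translate(boolean security) throws IOException {
        try {
            lowLevel(security);
        } catch (IllegalAccessException | IllegalStateException e) {
            // One arm replaces several identical catch blocks; the cause keeps
            // the original stack trace reachable through getCause().
            throw new IOException("Could not configure input format", e);
        } catch (Exception e) {
            throw new IOException(e);
        }
    }

    public static void main(String[] args) {
        try {
            translate(true);
        } catch (IOException e) {
            System.out.println(e.getCause().getClass().getSimpleName()); // prints IllegalAccessException
        }
    }
}
```

Because the cause is attached, `e.printStackTrace()` still shows the original frames, so nothing is lost by dropping `stringifyException`.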
[jira] [Updated] (HIVE-26492) orc.apache.orc.impl.writer.StructTreeWriter write wrong column stats when encounter decimal type value
[ https://issues.apache.org/jira/browse/HIVE-26492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yunhong Zheng updated HIVE-26492: - Description: For _orc.apache.orc.impl.writer.StructTreeWriter_ in hive-exec-3.1.1 . In this class, method _writeFileStatistics_ will create a wrong column stats min value (column stats min equals 0) when encounter decimal type column(like decimal(5, 2), decimal(14, 2)). the debug screenshot: !image-2022-08-22-22-05-53-412.png|width=822,height=249! was: For _orc.apache.orc.impl.writer.StructTreeWriter_ in hive-exec-3.1.1 . In this class, method _writeFileStatistics_ will create a wrong column stats min value (column stats min equals 0) when encounter decimal type column(like decimal(5, 2), decimal(14, 2)). the debug screenshot: !image-2022-08-22-21-06-29-980.png|width=405,height=280! > orc.apache.orc.impl.writer.StructTreeWriter write wrong column stats when > encounter decimal type value > -- > > Key: HIVE-26492 > URL: https://issues.apache.org/jira/browse/HIVE-26492 > Project: Hive > Issue Type: Bug > Components: ORC >Affects Versions: 3.1.1 >Reporter: Yunhong Zheng >Priority: Major > Fix For: 3.1.1 > > Attachments: image-2022-08-22-22-05-53-412.png > > > For _orc.apache.orc.impl.writer.StructTreeWriter_ in hive-exec-3.1.1 . In > this class, method _writeFileStatistics_ will create a wrong column stats min > value (column stats min equals 0) when encounter decimal type column(like > decimal(5, 2), decimal(14, 2)). the debug screenshot: > !image-2022-08-22-22-05-53-412.png|width=822,height=249! -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HIVE-26492) orc.apache.orc.impl.writer.StructTreeWriter write wrong column stats when encounter decimal type value
[ https://issues.apache.org/jira/browse/HIVE-26492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yunhong Zheng updated HIVE-26492: - Attachment: image-2022-08-22-22-05-53-412.png > orc.apache.orc.impl.writer.StructTreeWriter write wrong column stats when > encounter decimal type value > -- > > Key: HIVE-26492 > URL: https://issues.apache.org/jira/browse/HIVE-26492 > Project: Hive > Issue Type: Bug > Components: ORC >Affects Versions: 3.1.1 >Reporter: Yunhong Zheng >Priority: Major > Fix For: 3.1.1 > > Attachments: image-2022-08-22-22-05-53-412.png > > > For _orc.apache.orc.impl.writer.StructTreeWriter_ in hive-exec-3.1.1 . In > this class, method _writeFileStatistics_ will create a wrong column stats min > value (column stats min equals 0) when encounter decimal type column(like > decimal(5, 2), decimal(14, 2)). the debug screenshot: > !image-2022-08-22-21-06-29-980.png|width=405,height=280! -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HIVE-26492) orc.apache.orc.impl.writer.StructTreeWriter write wrong column stats when encounter decimal type value
[ https://issues.apache.org/jira/browse/HIVE-26492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yunhong Zheng updated HIVE-26492: - Description: For _orc.apache.orc.impl.writer.StructTreeWriter_ in hive-exec-3.1.1 . In this class, method _writeFileStatistics_ will create a wrong column stats min value (column stats min equals 0) when encounter decimal type column(like decimal(5, 2), decimal(14, 2)). the debug screenshot: !image-2022-08-22-21-06-29-980.png|width=405,height=280! was: For _orc.apache.orc.impl.writer.StructTreeWriter_ in hive-exec-3.1.1 . In this class, method _writeFileStatistics_ will create a wrong column stats min value (column stats min equals 0) when encounter decimal type column(like decimal(5, 2), decimal(14, 2)). the debug screenshot: [link title| !image-2022-08-22-21-06-29-980.png! ] > orc.apache.orc.impl.writer.StructTreeWriter write wrong column stats when > encounter decimal type value > -- > > Key: HIVE-26492 > URL: https://issues.apache.org/jira/browse/HIVE-26492 > Project: Hive > Issue Type: Bug > Components: ORC >Affects Versions: 3.1.1 >Reporter: Yunhong Zheng >Priority: Major > Fix For: 3.1.1 > > Attachments: image-2022-08-22-22-05-53-412.png > > > For _orc.apache.orc.impl.writer.StructTreeWriter_ in hive-exec-3.1.1 . In > this class, method _writeFileStatistics_ will create a wrong column stats min > value (column stats min equals 0) when encounter decimal type column(like > decimal(5, 2), decimal(14, 2)). the debug screenshot: > !image-2022-08-22-21-06-29-980.png|width=405,height=280! -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-26469) Remove stringifyException Method From QL Package
[ https://issues.apache.org/jira/browse/HIVE-26469?focusedWorklogId=802483&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-802483 ] ASF GitHub Bot logged work on HIVE-26469: - Author: ASF GitHub Bot Created on: 22/Aug/22 14:04 Start Date: 22/Aug/22 14:04 Worklog Time Spent: 10m Work Description: zabetak commented on code in PR #3518: URL: https://github.com/apache/hive/pull/3518#discussion_r951459418 ## ql/src/java/org/apache/hadoop/hive/ql/exec/tez/ReduceRecordSource.java: ## @@ -373,8 +370,7 @@ public void next() throws HiveException { try { rowString = SerDeUtils.getJSONString(row, rowObjectInspector); } catch (Exception e2) { - rowString = "[Error getting row data with exception " - + StringUtils.stringifyException(e2) + " ]"; + l4j.trace("Error getting row data (tag={})", tag, e2); Review Comment: The refactoring in `ExecReducer` was a bit different. Maybe it makes sense to keep things uniform. ## ql/src/java/org/apache/hadoop/hive/ql/exec/tez/ReduceRecordSource.java: ## @@ -502,8 +495,7 @@ private void processVectorGroup(BytesWritable keyWritable, try { rowString = batch.toString(); } catch (Exception e2) { -rowString = "[Error getting row data with exception " -+ StringUtils.stringifyException(e2) + " ]"; +l4j.error("Error getting row data (tag={})", tag, e2); Review Comment: The refactoring in `ExecReducer` is different. Moreover, previously we used `l4j.trace` and here we use `l4j.error`. The discussion in `HIVE-20644` indicates that maybe it is safer to log at trace level when it comes to data. This is outside the scope of this PR but maybe worth filing a JIRA case for this. 
## ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFSum.java: ## @@ -286,11 +286,7 @@ public void iterate(AggregationBuffer agg, Object[] parameters) throws HiveExcep } catch (NumberFormatException e) { if (!warned) { warned = true; - LOG.warn(getClass().getSimpleName() + " " - + StringUtils.stringifyException(e)); - LOG - .warn(getClass().getSimpleName() - + " ignoring similar exceptions."); + LOG.warn("{}: ignoring similar exceptions", getClass().getSimpleName(), e); Review Comment: Remove unused `import org.apache.hadoop.util.StringUtils;` ## ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFSum.java: ## @@ -286,11 +286,7 @@ public void iterate(AggregationBuffer agg, Object[] parameters) throws HiveExcep } catch (NumberFormatException e) { if (!warned) { warned = true; - LOG.warn(getClass().getSimpleName() + " " - + StringUtils.stringifyException(e)); - LOG - .warn(getClass().getSimpleName() - + " ignoring similar exceptions."); + LOG.warn("{}: ignoring similar exceptions", getClass().getSimpleName(), e); Review Comment: The `getClass().getSimpleName()` is a bit useless/redundant for two reasons: * The class name will appear in the stacktrace; * The log configuration can be changed to include the class name if necessary. The comment applies to all changes in this class. The proposed refactoring though simply retains the old behavior so this is not a blocking comment. 
## ql/src/java/org/apache/hadoop/hive/ql/exec/tez/ReduceRecordSource.java: ## @@ -398,15 +394,12 @@ private boolean pushRecordVector() { processVectorGroup(keyWritable, valueWritables, tag); return true; -} catch (Throwable e) { +} catch (OutOfMemoryError oom) { abort = true; - if (e instanceof OutOfMemoryError) { -// Don't create a new object if we are already out of memory -throw (OutOfMemoryError) e; - } else { -l4j.error(StringUtils.stringifyException(e)); -throw new RuntimeException(e); - } + // Do not create a new object if we are already out of memory + throw oom; +} catch (Throwable t) { + throw new RuntimeException(t); Review Comment: `abort = true;` is missing here. This is the only major review comment in this PR. Issue Time Tracking --- Worklog Id: (was: 802483) Time Spent: 20m (was: 10m) > Remove stringifyException Method From QL Package > > > Key: HIVE-26469 > URL: https://issues.apache.org/jira/browse/HIVE-26469 > Project: Hive > Issue Type: Sub-task >Reporter: David Mollitor >Assignee: David Mollitor >Priority: Major > Labels: pull-request-available > Time Spent: 20m >
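The last comment concerns the reworked catch structure in `pushRecordVector`: rethrow `OutOfMemoryError` without allocating a new object, wrap everything else, and set the abort flag on both paths (the missing `abort = true;` being the one major review comment). A standalone sketch of that pattern, with illustrative class and method names rather than the actual Hive code:

```java
// Sketch of the catch structure discussed in the review (not Hive code).
public class AbortOnErrorSketch {

    boolean abort = false;

    boolean pushRecord(Runnable work) {
        try {
            work.run();
            return true;
        } catch (OutOfMemoryError oom) {
            abort = true;
            throw oom;         // do not allocate while out of memory
        } catch (Throwable t) {
            abort = true;      // the line the review found missing
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        AbortOnErrorSketch source = new AbortOnErrorSketch();
        try {
            source.pushRecord(() -> { throw new IllegalStateException("boom"); });
        } catch (RuntimeException e) {
            System.out.println(source.abort + " " + e.getCause().getMessage()); // prints true boom
        }
    }
}
```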
[jira] [Updated] (HIVE-26492) orc.apache.orc.impl.writer.StructTreeWriter write wrong column stats when encounter decimal type value
[ https://issues.apache.org/jira/browse/HIVE-26492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yunhong Zheng updated HIVE-26492: - Description: For _orc.apache.orc.impl.writer.StructTreeWriter_ in hive-exec-3.1.1 . In this class, method _writeFileStatistics_ will create a wrong column stats min value (column stats min equals 0) when encounter decimal type column(like decimal(5, 2), decimal(14, 2)). the debug screenshot: [link title| !image-2022-08-22-21-06-29-980.png! ] was: For _orc.apache.orc.impl.writer.StructTreeWriter_ in hive-exec-3.1.1 . In this class, method _writeFileStatistics_ will create a wrong column stats min value (column stats min equals 0) when encounter decimal type column(like decimal(5, 2), decimal(14, 2)). the debug screenshot: !image-2022-08-22-21-06-29-980.png|width=454,height=314! > orc.apache.orc.impl.writer.StructTreeWriter write wrong column stats when > encounter decimal type value > -- > > Key: HIVE-26492 > URL: https://issues.apache.org/jira/browse/HIVE-26492 > Project: Hive > Issue Type: Bug > Components: ORC >Affects Versions: 3.1.1 >Reporter: Yunhong Zheng >Priority: Major > Fix For: 3.1.1 > > > For _orc.apache.orc.impl.writer.StructTreeWriter_ in hive-exec-3.1.1 . In > this class, method _writeFileStatistics_ will create a wrong column stats min > value (column stats min equals 0) when encounter decimal type column(like > decimal(5, 2), decimal(14, 2)). the debug screenshot: > [link title| !image-2022-08-22-21-06-29-980.png! ] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HIVE-26492) orc.apache.orc.impl.writer.StructTreeWriter write wrong column stats when encounter decimal type value
[ https://issues.apache.org/jira/browse/HIVE-26492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yunhong Zheng updated HIVE-26492: - Description: For _orc.apache.orc.impl.writer.StructTreeWriter_ in hive-exec-3.1.1 . In this class, method _writeFileStatistics_ will create a wrong column stats min value (column stats min equals 0) when encounter decimal type column(like decimal(5, 2), decimal(14, 2)). the debug screenshot: !image-2022-08-22-21-06-29-980.png|width=454,height=314! was: For _orc.apache.orc.impl.writer.StructTreeWriter_ in hive-exec-3.1.1 . In this class, method _writeFileStatistics_ will create a wrong column stats min value (column stats min equals 0) when encounter decimal type column(like decimal(5, 2), decimal(14, 2)). the debug screenshot: !image-2022-08-22-21-06-29-980.png! > orc.apache.orc.impl.writer.StructTreeWriter write wrong column stats when > encounter decimal type value > -- > > Key: HIVE-26492 > URL: https://issues.apache.org/jira/browse/HIVE-26492 > Project: Hive > Issue Type: Bug > Components: ORC >Affects Versions: 3.1.1 >Reporter: Yunhong Zheng >Priority: Major > Fix For: 3.1.1 > > > For _orc.apache.orc.impl.writer.StructTreeWriter_ in hive-exec-3.1.1 . In > this class, method _writeFileStatistics_ will create a wrong column stats min > value (column stats min equals 0) when encounter decimal type column(like > decimal(5, 2), decimal(14, 2)). the debug screenshot: > !image-2022-08-22-21-06-29-980.png|width=454,height=314! -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-26443) Add priority queueing to compaction
[ https://issues.apache.org/jira/browse/HIVE-26443?focusedWorklogId=802479&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-802479 ] ASF GitHub Bot logged work on HIVE-26443: - Author: ASF GitHub Bot Created on: 22/Aug/22 13:37 Start Date: 22/Aug/22 13:37 Worklog Time Spent: 10m Work Description: veghlaci05 commented on code in PR #3513: URL: https://github.com/apache/hive/pull/3513#discussion_r951451013 ## standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java: ## @@ -3844,34 +3847,51 @@ protected static String compactorStateToResponse(char s) { public ShowCompactResponse showCompact(ShowCompactRequest rqst) throws MetaException { ShowCompactResponse response = new ShowCompactResponse(new ArrayList<>()); Connection dbConn = null; -Statement stmt = null; +PreparedStatement stmt = null; try { try { -dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED); -stmt = dbConn.createStatement(); -String s = "" + -//-1 because 'null' literal doesn't work for all DBs... +StringBuilder sb =new StringBuilder(2048); +sb.append( "SELECT " + " \"CQ_DATABASE\", \"CQ_TABLE\", \"CQ_PARTITION\", \"CQ_STATE\", \"CQ_TYPE\", \"CQ_WORKER_ID\", " + " \"CQ_START\", -1 \"CC_END\", \"CQ_RUN_AS\", \"CQ_HADOOP_JOB_ID\", \"CQ_ID\", \"CQ_ERROR_MESSAGE\", " + " \"CQ_ENQUEUE_TIME\", \"CQ_WORKER_VERSION\", \"CQ_INITIATOR_ID\", \"CQ_INITIATOR_VERSION\", " + -" \"CQ_CLEANER_START\"" + +" \"CQ_CLEANER_START\", \"CQ_POOL_NAME\"" + "FROM " + -" \"COMPACTION_QUEUE\" " + +" \"COMPACTION_QUEUE\" " +); +if (org.apache.commons.lang3.StringUtils.isNotBlank(rqst.getPoolName())) { + sb.append("WHERE \"CQ_POOL_NAME\" = ? 
"); +} +sb.append( "UNION ALL " + "SELECT " + " \"CC_DATABASE\", \"CC_TABLE\", \"CC_PARTITION\", \"CC_STATE\", \"CC_TYPE\", \"CC_WORKER_ID\", " + " \"CC_START\", \"CC_END\", \"CC_RUN_AS\", \"CC_HADOOP_JOB_ID\", \"CC_ID\", \"CC_ERROR_MESSAGE\", " + " \"CC_ENQUEUE_TIME\", \"CC_WORKER_VERSION\", \"CC_INITIATOR_ID\", \"CC_INITIATOR_VERSION\", " + -" -1 " + +" -1 , \"CC_POOL_NAME\"" + "FROM " + -" \"COMPLETED_COMPACTIONS\""; //todo: sort by cq_id? +" \"COMPLETED_COMPACTIONS\" " +); +if (org.apache.commons.lang3.StringUtils.isNotBlank(rqst.getPoolName())) { + sb.append("WHERE \"CC_POOL_NAME\" = ?"); +} +//todo: sort by cq_id? //what I want is order by cc_end desc, cc_start asc (but derby has a bug https://issues.apache.org/jira/browse/DERBY-6013) //to sort so that currently running jobs are at the end of the list (bottom of screen) //and currently running ones are in sorted by start time //w/o order by likely currently running compactions will be first (LHS of Union) -LOG.debug("Going to execute query <" + s + ">"); -ResultSet rs = stmt.executeQuery(s); + +String query = sb.toString(); +dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED); +stmt = dbConn.prepareStatement(query); +if (org.apache.commons.lang3.StringUtils.isNotBlank(rqst.getPoolName())) { Review Comment: Due to your comment I realized that I forgot to enhance the `SHOW COMPACTIONS `command with the `POOL 'poolname'` part :). It will be in my next commit along with the fixes of your points. Regarding the prepared statement, the reason behind it is the same like for the ALTER TABLE COMPACT command: The pool is coming from the user and could be a subject of SQL injection. 
Issue Time Tracking --- Worklog Id: (was: 802479) Time Spent: 6h (was: 5h 50m) > Add priority queueing to compaction > --- > > Key: HIVE-26443 > URL: https://issues.apache.org/jira/browse/HIVE-26443 > Project: Hive > Issue Type: New Feature >Reporter: László Végh >Assignee: László Végh >Priority: Major > Labels: pull-request-available > Attachments: Pool based compaction queues.docx > > Time Spent: 6h > Remaining Estimate: 0h > > The details can be found in the attached design doc. -- This message was sent by Atlassian Jira (v8.20.10#820010)
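The exchange above (a `?` placeholder in each UNION branch, parameters bound in order) can be sketched without a live database. This is an illustrative query builder, not the actual `TxnHandler` code; only the column and table names are taken from the diff:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// The optional pool-name filter becomes a '?' placeholder in each UNION
// branch; the user-supplied value only ever lands in the parameter list,
// never in the SQL text.
public class OptionalFilterQuery {

    static Map.Entry<String, List<String>> build(String poolName) {
        StringBuilder sb = new StringBuilder("SELECT \"CQ_ID\" FROM \"COMPACTION_QUEUE\" ");
        List<String> params = new ArrayList<>();
        boolean filtered = poolName != null && !poolName.isBlank();
        if (filtered) {
            sb.append("WHERE \"CQ_POOL_NAME\" = ? ");
            params.add(poolName);
        }
        sb.append("UNION ALL SELECT \"CC_ID\" FROM \"COMPLETED_COMPACTIONS\" ");
        if (filtered) {
            sb.append("WHERE \"CC_POOL_NAME\" = ?");  // second placeholder, bound in order
            params.add(poolName);
        }
        return Map.entry(sb.toString(), params);
    }

    public static void main(String[] args) {
        // A hostile pool name stays inert: the SQL text is unchanged by it.
        Map.Entry<String, List<String>> q = build("x' OR '1'='1");
        System.out.println(q.getKey().contains("OR"));  // prints false
        System.out.println(q.getValue().size());        // prints 2
    }
}
```

Binding the value through `PreparedStatement.setString` (here just collected into `params`) keeps it out of the SQL text entirely, which is the injection concern the comment raises.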
[jira] [Work logged] (HIVE-26482) Create a unit test checking compaction output file names on a partitioned table
[ https://issues.apache.org/jira/browse/HIVE-26482?focusedWorklogId=802477&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-802477 ] ASF GitHub Bot logged work on HIVE-26482: - Author: ASF GitHub Bot Created on: 22/Aug/22 13:32 Start Date: 22/Aug/22 13:32 Worklog Time Spent: 10m Work Description: InvisibleProgrammer opened a new pull request, #3532: URL: https://github.com/apache/hive/pull/3532 There is no test about the output directory names after running compaction on partitions Compaction output directories' writeIds only reflect the writeIds of the deltas it compacts, and not the max write id of the table Example: ``` Pre-compaction... Partition p=1 contains: delta_1_1 delta_2_2 partition p=2 contains delta_3_3 delta_4_4 After minor compaction... Partition p=1 contains: delta_1_2 partition p=2 contains delta_3_4 ``` ### What changes were proposed in this pull request? New test added: `testCompactionOutputDirectoryNamesOnPartitions` ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? New test added: `testCompactionOutputDirectoryNamesOnPartitions` Issue Time Tracking --- Worklog Id: (was: 802477) Time Spent: 0.5h (was: 20m) > Create a unit test checking compaction output file names on a partitioned > table > --- > > Key: HIVE-26482 > URL: https://issues.apache.org/jira/browse/HIVE-26482 > Project: Hive > Issue Type: Test > Components: Hive >Reporter: Zsolt Miskolczi >Assignee: Zsolt Miskolczi >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > Compaction output directories' writeIds only reflect the writeIds of the > deltas it compacts, and not the max write id of the table > Example: > Pre-compaction... > {code:java} > Partition p=1 contains: > delta_1_1 > delta_2_2 > partition p=2 contains > delta_3_3 > delta_4_4 > {code} > After minor compaction... 
> {code:java} > Partition p=1 contains: > delta_1_2 > partition p=2 contains > delta_3_4 > {code} > AFAIK there are no unit tests that reflect this. > TestTxnCommands2#testFullACIDAbortWithManyPartitions is a good template to > start with. -- This message was sent by Atlassian Jira (v8.20.10#820010)
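The delta-directory example in the issue follows a simple rule: per partition, minor compaction merges the inputs into one directory spanning the smallest min and the largest max writeId of those inputs, not the table's max writeId. A sketch of that naming arithmetic, assuming the simplified delta_<min>_<max> form used in the example (real Hive deltas may carry extra suffixes):

```java
import java.util.List;

// Naming arithmetic for the minor-compaction example (simplified form).
public class DeltaNameSketch {

    static String compact(List<String> deltas) {
        long min = Long.MAX_VALUE;
        long max = Long.MIN_VALUE;
        for (String d : deltas) {
            String[] parts = d.split("_");  // ["delta", minWriteId, maxWriteId]
            min = Math.min(min, Long.parseLong(parts[1]));
            max = Math.max(max, Long.parseLong(parts[2]));
        }
        return "delta_" + min + "_" + max;
    }

    public static void main(String[] args) {
        // Matches the pre/post state in the issue: each partition compacts
        // only its own deltas, so p=2 is unaffected by p=1's writeIds.
        System.out.println(compact(List.of("delta_1_1", "delta_2_2"))); // prints delta_1_2
        System.out.println(compact(List.of("delta_3_3", "delta_4_4"))); // prints delta_3_4
    }
}
```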
[jira] [Work logged] (HIVE-26482) Create a unit test checking compaction output file names on a partitioned table
[ https://issues.apache.org/jira/browse/HIVE-26482?focusedWorklogId=802475&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-802475 ] ASF GitHub Bot logged work on HIVE-26482: - Author: ASF GitHub Bot Created on: 22/Aug/22 13:30 Start Date: 22/Aug/22 13:30 Worklog Time Spent: 10m Work Description: InvisibleProgrammer closed pull request #3532: HIVE-26482: Add test to check names after compaction on partition URL: https://github.com/apache/hive/pull/3532 Issue Time Tracking --- Worklog Id: (was: 802475) Time Spent: 20m (was: 10m) > Create a unit test checking compaction output file names on a partitioned > table > --- > > Key: HIVE-26482 > URL: https://issues.apache.org/jira/browse/HIVE-26482 > Project: Hive > Issue Type: Test > Components: Hive >Reporter: Zsolt Miskolczi >Assignee: Zsolt Miskolczi >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > Compaction output directories' writeIds only reflect the writeIds of the > deltas it compacts, and not the max write id of the table > Example: > Pre-compaction... > {code:java} > Partition p=1 contains: > delta_1_1 > delta_2_2 > partition p=2 contains > delta_3_3 > delta_4_4 > {code} > After minor compaction... > {code:java} > Partition p=1 contains: > delta_1_2 > partition p=2 contains > delta_3_4 > {code} > AFAIK there are no unit tests that reflect this. > TestTxnCommands2#testFullACIDAbortWithManyPartitions is a good template to > start with. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-24083) hcatalog error in Hadoop 3.3.0: authentication type needed
[ https://issues.apache.org/jira/browse/HIVE-24083?focusedWorklogId=802452&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-802452 ] ASF GitHub Bot logged work on HIVE-24083: - Author: ASF GitHub Bot Created on: 22/Aug/22 12:12 Start Date: 22/Aug/22 12:12 Worklog Time Spent: 10m Work Description: zabetak commented on PR #3520: URL: https://github.com/apache/hive/pull/3520#issuecomment-170412 I left some comments under the JIRA. Keep in mind that whatever lands in branch-3.1 (or older branches) must first land in `master`. Issue Time Tracking --- Worklog Id: (was: 802452) Time Spent: 40m (was: 0.5h) > hcatalog error in Hadoop 3.3.0: authentication type needed > -- > > Key: HIVE-24083 > URL: https://issues.apache.org/jira/browse/HIVE-24083 > Project: Hive > Issue Type: Bug > Components: HCatalog >Affects Versions: 3.1.2 >Reporter: Javier J. Salmeron Garcia >Priority: Minor > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > Using Hive 3.1.2, webhcat fails to start in Hadoop 3.3.0 with the following > error: > ``` > javax.servlet.ServletException: Authentication type must be specified: > simple|kerberos > ``` > I tried in Hadoop 3.2.1 with the exact settings and it starts without issues: > > ``` > webhcat: /tmp/hadoop-3.2.1//bin/hadoop jar > /opt/bitnami/hadoop/hive/hcatalog/sbin/../share/webhcat/svr/lib/hive-webhcat-3.1.2.jar > org.apache.hive.hcatalog.templeton.Main > webhcat: starting ... started. > webhcat: done > ``` > > I can provide more logs if needed. Detected authentication settings: > > ``` > hadoop.http.authentication.simple.anonymous.allowed=true > hadoop.http.authentication.type=simple > hadoop.security.authentication=simple > ipc.client.fallback-to-simple-auth-allowed=false > yarn.timeline-service.http-authentication.simple.anonymous.allowed=true > yarn.timeline-service.http-authentication.type=simple > ``` > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (HIVE-24083) hcatalog error in Hadoop 3.3.0: authentication type needed
[ https://issues.apache.org/jira/browse/HIVE-24083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17582936#comment-17582936 ] Stamatis Zampetakis commented on HIVE-24083: Hive does not claim to support multiple Hadoop versions. From what I can see for Hive 3.x the supported Hadoop version is [3.1.0|https://github.com/apache/hive/blob/01924c8d54b146dee44c30d104fb60fbcdec0e87/pom.xml#L151]. There are no guarantees or requirements to work with newer or older Hadoop versions so this may never get fixed in 3.x releases. If the problem affects master then please include steps to reproduce and unit/integration tests as part of the PR. > hcatalog error in Hadoop 3.3.0: authentication type needed > -- > > Key: HIVE-24083 > URL: https://issues.apache.org/jira/browse/HIVE-24083 > Project: Hive > Issue Type: Bug > Components: HCatalog >Affects Versions: 3.1.2 >Reporter: Javier J. Salmeron Garcia >Priority: Minor > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > Using Hive 3.1.2, webhcat fails to start in Hadoop 3.3.0 with the following > error: > ``` > javax.servlet.ServletException: Authentication type must be specified: > simple|kerberos > ``` > I tried in Hadoop 3.2.1 with the exact settings and it starts without issues: > > ``` > webhcat: /tmp/hadoop-3.2.1//bin/hadoop jar > /opt/bitnami/hadoop/hive/hcatalog/sbin/../share/webhcat/svr/lib/hive-webhcat-3.1.2.jar > org.apache.hive.hcatalog.templeton.Main > webhcat: starting ... started. > webhcat: done > ``` > > I can provide more logs if needed. 
Detected authentication settings: > > ``` > hadoop.http.authentication.simple.anonymous.allowed=true > hadoop.http.authentication.type=simple > hadoop.security.authentication=simple > ipc.client.fallback-to-simple-auth-allowed=false > yarn.timeline-service.http-authentication.simple.anonymous.allowed=true > yarn.timeline-service.http-authentication.type=simple > ``` > -- This message was sent by Atlassian Jira (v8.20.10#820010)
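The flat key=value settings quoted in the report normally live in Hadoop's core-site.xml (the yarn.timeline-service keys in yarn-site.xml). As a sketch, the simple-authentication portion would look like the following; the values are taken from the report itself, and whether webhcat actually honors them under Hadoop 3.3.0 is exactly what this issue disputes:

```
<!-- core-site.xml: simple-auth settings quoted in the report (sketch) -->
<property>
  <name>hadoop.security.authentication</name>
  <value>simple</value>
</property>
<property>
  <name>hadoop.http.authentication.type</name>
  <value>simple</value>
</property>
<property>
  <name>hadoop.http.authentication.simple.anonymous.allowed</name>
  <value>true</value>
</property>
```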
[jira] [Work logged] (HIVE-26407) Do not collect statistics if the compaction fails
[ https://issues.apache.org/jira/browse/HIVE-26407?focusedWorklogId=802432&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-802432 ] ASF GitHub Bot logged work on HIVE-26407: - Author: ASF GitHub Bot Created on: 22/Aug/22 10:30 Start Date: 22/Aug/22 10:30 Worklog Time Spent: 10m Work Description: deniskuzZ commented on code in PR #3489: URL: https://github.com/apache/hive/pull/3489#discussion_r951273580 ## ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/StatsUpdater.java: ## @@ -0,0 +1,84 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.hadoop.hive.ql.txn.compactor; + +import org.apache.hadoop.hive.common.ValidTxnList; +import org.apache.hadoop.hive.conf.HiveConf; +import org.apache.hadoop.hive.metastore.Warehouse; +import org.apache.hadoop.hive.metastore.txn.CompactionInfo; +import org.apache.hadoop.hive.ql.DriverUtils; +import org.apache.hadoop.hive.ql.session.SessionState; +import org.apache.hadoop.hive.ql.stats.StatsUtils; +import org.apache.tez.dag.api.TezConfiguration; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.Map; + +public final class StatsUpdater { Review Comment: Javadoc is missing Issue Time Tracking --- Worklog Id: (was: 802432) Time Spent: 4h (was: 3h 50m) > Do not collect statistics if the compaction fails > - > > Key: HIVE-26407 > URL: https://issues.apache.org/jira/browse/HIVE-26407 > Project: Hive > Issue Type: Test > Components: Hive >Reporter: Zsolt Miskolczi >Assignee: Zsolt Miskolczi >Priority: Minor > Labels: pull-request-available > Time Spent: 4h > Remaining Estimate: 0h > > It can still compute statistics, even if compaction fails. > if (computeStats) \{ > StatsUpdater.gatherStats(ci, conf, runJobAsSelf(ci.runAs) ? ci.runAs : > t1.getOwner(), > CompactorUtil.getCompactorJobQueueName(conf, ci, t1)); > } -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-26443) Add priority queueing to compaction
[ https://issues.apache.org/jira/browse/HIVE-26443?focusedWorklogId=802430&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-802430 ] ASF GitHub Bot logged work on HIVE-26443: - Author: ASF GitHub Bot Created on: 22/Aug/22 10:18 Start Date: 22/Aug/22 10:18 Worklog Time Spent: 10m Work Description: veghlaci05 commented on code in PR #3513: URL: https://github.com/apache/hive/pull/3513#discussion_r951263511 ## standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java: ## @@ -3783,9 +3785,10 @@ public boolean submitForCleanup(CompactionRequest rqst, long highestWriteId, lon buf.append("\"CQ_PARTITION\", "); params.add(partName); } -buf.append("\"CQ_STATE\", \"CQ_TYPE\""); +buf.append("\"CQ_STATE\", \"CQ_TYPE\", \"CQ_POOL_NAME\""); Review Comment: You are right, this method creates the entries with `READY_FOR_CLEANING` status. If there's no use case where these requests are somehow picked up by the Worker, this change can be reverted. Issue Time Tracking --- Worklog Id: (was: 802430) Time Spent: 5h 50m (was: 5h 40m) > Add priority queueing to compaction > --- > > Key: HIVE-26443 > URL: https://issues.apache.org/jira/browse/HIVE-26443 > Project: Hive > Issue Type: New Feature >Reporter: László Végh >Assignee: László Végh >Priority: Major > Labels: pull-request-available > Attachments: Pool based compaction queues.docx > > Time Spent: 5h 50m > Remaining Estimate: 0h > > The details can be found in the attached design doc. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-26443) Add priority queueing to compaction
[ https://issues.apache.org/jira/browse/HIVE-26443?focusedWorklogId=802426&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-802426 ] ASF GitHub Bot logged work on HIVE-26443: - Author: ASF GitHub Bot Created on: 22/Aug/22 10:13 Start Date: 22/Aug/22 10:13 Worklog Time Spent: 10m Work Description: veghlaci05 commented on code in PR #3513: URL: https://github.com/apache/hive/pull/3513#discussion_r951259129 ## standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java: ## @@ -3711,6 +3711,8 @@ public CompactionResponse compact(CompactionRequest rqst) throws MetaException { buf.append(thriftCompactionType2DbType(rqst.getType())); buf.append("',"); buf.append(getEpochFn(dbProduct)); +buf.append(", ?"); +params.add(rqst.getPoolName()); Review Comment: According to the tests in `org.apache.hadoop.hive.ql.txn.compactor.TestCompactionPools` yes. Issue Time Tracking --- Worklog Id: (was: 802426) Time Spent: 5h 40m (was: 5.5h) > Add priority queueing to compaction > --- > > Key: HIVE-26443 > URL: https://issues.apache.org/jira/browse/HIVE-26443 > Project: Hive > Issue Type: New Feature >Reporter: László Végh >Assignee: László Végh >Priority: Major > Labels: pull-request-available > Attachments: Pool based compaction queues.docx > > Time Spent: 5h 40m > Remaining Estimate: 0h > > The details can be found in the attached design doc. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-26443) Add priority queueing to compaction
[ https://issues.apache.org/jira/browse/HIVE-26443?focusedWorklogId=802425&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-802425 ] ASF GitHub Bot logged work on HIVE-26443: - Author: ASF GitHub Bot Created on: 22/Aug/22 10:05 Start Date: 22/Aug/22 10:05 Worklog Time Spent: 10m Work Description: veghlaci05 commented on code in PR #3513: URL: https://github.com/apache/hive/pull/3513#discussion_r951251996 ## standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionTxnHandler.java: ## @@ -229,17 +231,33 @@ public CompactionInfo findNextToCompact(FindNextCompactRequest rqst) throws Meta } Connection dbConn = null; - Statement stmt = null; + PreparedStatement stmt = null; //need a separate stmt for executeUpdate() otherwise it will close the ResultSet(HIVE-12725) Statement updStmt = null; ResultSet rs = null; + + long poolTimeout = MetastoreConf.getTimeVar(conf, ConfVars.COMPACTOR_WORKER_POOL_TIMEOUT, TimeUnit.MILLISECONDS); + try { dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED, connPoolCompaction); -stmt = dbConn.createStatement(); -String query = "SELECT \"CQ_ID\", \"CQ_DATABASE\", \"CQ_TABLE\", \"CQ_PARTITION\", " + - "\"CQ_TYPE\", \"CQ_TBLPROPERTIES\" FROM \"COMPACTION_QUEUE\" WHERE \"CQ_STATE\" = '" + INITIATED_STATE + "'"; +StringBuilder sb = new StringBuilder(); +sb.append("SELECT \"CQ_ID\", \"CQ_DATABASE\", \"CQ_TABLE\", \"CQ_PARTITION\", " + + "\"CQ_TYPE\", \"CQ_POOL_NAME\", \"CQ_TBLPROPERTIES\" FROM \"COMPACTION_QUEUE\" WHERE \"CQ_STATE\" = '" + INITIATED_STATE + "' AND "); +boolean hasPoolName = org.apache.commons.lang3.StringUtils.isNotBlank(rqst.getPoolName()); +if(hasPoolName) { + sb.append("\"CQ_POOL_NAME\"=?"); +} else { + sb.append("\"CQ_POOL_NAME\" is null OR \"CQ_ENQUEUE_TIME\" < (") +.append(getEpochFn(dbProduct)).append(" - ").append(poolTimeout).append(")"); +} +String query = sb.toString(); +stmt = dbConn.prepareStatement(query); +if (hasPoolName) { + 
stmt.setString(1, rqst.getPoolName()); Review Comment: I did not want to concatenate the pool name directly into the SQL statement. The pool name can be passed as part of the `ALTER TABLE COMPACT` command, and could therefore be a target of SQL injection attempts. Issue Time Tracking --- Worklog Id: (was: 802425) Time Spent: 5.5h (was: 5h 20m) > Add priority queueing to compaction > --- > > Key: HIVE-26443 > URL: https://issues.apache.org/jira/browse/HIVE-26443 > Project: Hive > Issue Type: New Feature > Reporter: László Végh > Assignee: László Végh > Priority: Major > Labels: pull-request-available > Attachments: Pool based compaction queues.docx > > Time Spent: 5.5h > Remaining Estimate: 0h > > The details can be found in the attached design doc. -- This message was sent by Atlassian Jira (v8.20.10#820010)
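The parameterized filtering the reviewer describes can be sketched as follows. The table and column names mirror the quoted SQL; the class and the simplified state literal are illustrative, and the actual JDBC execution and timeout branch are omitted:

```java
// Sketch: build the find-next-compaction query with a bound parameter for the
// pool name instead of concatenating user-supplied input into the SQL text.
public class CompactionQueryBuilder {
    static String buildQuery(String poolName) {
        StringBuilder sb = new StringBuilder(
            "SELECT \"CQ_ID\" FROM \"COMPACTION_QUEUE\" WHERE \"CQ_STATE\" = 'i' AND ");
        boolean hasPoolName = poolName != null && !poolName.trim().isEmpty();
        if (hasPoolName) {
            // The value is bound later via PreparedStatement.setString(1, poolName),
            // so a pool name coming from ALTER TABLE ... COMPACT cannot inject SQL.
            sb.append("\"CQ_POOL_NAME\" = ?");
        } else {
            sb.append("\"CQ_POOL_NAME\" IS NULL");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(buildQuery("etl-pool")); // query text contains only the placeholder
        System.out.println(buildQuery(null));       // default-pool variant
    }
}
```

Note that the query text never contains the pool name itself, only the `?` placeholder; that is the whole point of the prepared-statement approach.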
[jira] [Work logged] (HIVE-26481) Cleaner fails with FileNotFoundException
[ https://issues.apache.org/jira/browse/HIVE-26481?focusedWorklogId=802424&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-802424 ] ASF GitHub Bot logged work on HIVE-26481: - Author: ASF GitHub Bot Created on: 22/Aug/22 10:01 Start Date: 22/Aug/22 10:01 Worklog Time Spent: 10m Work Description: sonarcloud[bot] commented on PR #3531: URL: https://github.com/apache/hive/pull/3531#issuecomment-1222130792 Kudos, SonarCloud Quality Gate passed! 0 Bugs, 0 Vulnerabilities, 0 Security Hotspots, 12 Code Smells; no coverage or duplication information. Issue Time Tracking --- Worklog Id: (was: 802424) Time Spent: 1h 50m (was: 1h 40m) > Cleaner fails with FileNotFoundException > > > Key: HIVE-26481 > URL: https://issues.apache.org/jira/browse/HIVE-26481 > Project: Hive > Issue Type: Bug > Reporter: KIRTI RUGE > Priority: Major > Labels: pull-request-available > Time Spent: 1h
50m > Remaining Estimate: 0h > > The compaction fails when the Cleaner tried to remove a missing directory > from HDFS. > {code:java} > 2022-08-05 18:56:38,873 INFO org.apache.hadoop.hive.ql.txn.compactor.Cleaner: > [Cleaner-executor-thread-0]: Starting cleaning for > id:30,dbname:default,tableName:test_concur_compaction_minor,partName:null,state:�,type:MINOR,enqueueTime:0,start:0,properties:null,runAs:hive,tooManyAborts:false,hasOldAbort:false,highestWriteId:4,errorMessage:null,workerId: > null,initiatorId: null 2022-08-05 18:56:38,888 ERROR > org.apache.hadoop.hive.ql.txn.compactor.Cleaner: [Cleaner-executor-thread-0]: > Caught exception when cleaning, unable to complete cleaning of > id:30,
[jira] [Work logged] (HIVE-26443) Add priority queueing to compaction
[ https://issues.apache.org/jira/browse/HIVE-26443?focusedWorklogId=802423&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-802423 ] ASF GitHub Bot logged work on HIVE-26443: - Author: ASF GitHub Bot Created on: 22/Aug/22 10:00 Start Date: 22/Aug/22 10:00 Worklog Time Spent: 10m Work Description: veghlaci05 commented on code in PR #3513: URL: https://github.com/apache/hive/pull/3513#discussion_r951247036 ## ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Worker.java: ## @@ -317,6 +318,15 @@ protected Boolean findNextCompactionAndExecute(boolean collectGenericStats, bool if ((runtimeVersion != null || ci.initiatorVersion != null) && !runtimeVersion.equals(ci.initiatorVersion)) { LOG.warn("Worker and Initiator versions do not match. Worker: v{}, Initiator: v{}", runtimeVersion, ci.initiatorVersion); } + + if (StringUtils.isBlank(getPoolName()) && StringUtils.isNotBlank(ci.poolName)) { +LOG.warn("A timed out compaction pool entry ({}) is picked up by one of the default compaction pool workers.", ci); + } + if (StringUtils.isNotBlank(getPoolName()) && StringUtils.isNotBlank(ci.poolName) && !getPoolName().equals(ci.poolName)) { Review Comment: This normally should not happen at all, because the query filters the items by pool name. However, I wanted to cover this case as well. If a labeled (non-default) pool somehow gets a request assigned to another pool or to the default pool, I think it should not be processed. - Simply skipping it could be problematic if the item gets returned by `findNextCompact()` again and again. In this case the item would get stuck in the `initiated` state, and if there's only one worker assigned to the pool, it would even stall the entire queue processing for that pool. - Marking it as failed with a proper error message seemed more appropriate: it won't cause processing anomalies and is easier to track down later.
Issue Time Tracking --- Worklog Id: (was: 802423) Time Spent: 5h 20m (was: 5h 10m) > Add priority queueing to compaction > --- > > Key: HIVE-26443 > URL: https://issues.apache.org/jira/browse/HIVE-26443 > Project: Hive > Issue Type: New Feature >Reporter: László Végh >Assignee: László Végh >Priority: Major > Labels: pull-request-available > Attachments: Pool based compaction queues.docx > > Time Spent: 5h 20m > Remaining Estimate: 0h > > The details can be found in the attached design doc. -- This message was sent by Atlassian Jira (v8.20.10#820010)
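The worker-side guard discussed above can be sketched as a small decision function: a default-pool worker may process a timed-out pooled entry, while a labeled pool marks a mismatched entry failed rather than skipping it. The names and return values are illustrative, not Hive's actual Worker API:

```java
// Sketch of the pool-mismatch guard described in the review comment above.
public class PoolMismatchGuard {
    static boolean isBlank(String s) { return s == null || s.trim().isEmpty(); }

    /** Returns "process", "warn-and-process", or "fail" (illustrative values). */
    static String decide(String workerPool, String entryPool) {
        if (isBlank(workerPool) && !isBlank(entryPool)) {
            // Default-pool worker picked up a timed-out pooled entry: allowed, log a warning.
            return "warn-and-process";
        }
        if (!isBlank(workerPool) && !isBlank(entryPool) && !workerPool.equals(entryPool)) {
            // Labeled pool got someone else's entry: fail it instead of skipping,
            // otherwise findNextToCompact() would keep returning it and stall the pool.
            return "fail";
        }
        return "process";
    }

    public static void main(String[] args) {
        System.out.println(decide(null, "etl"));   // warn-and-process
        System.out.println(decide("etl", "bi"));   // fail
        System.out.println(decide("etl", "etl"));  // process
    }
}
```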
[jira] [Work logged] (HIVE-26476) Iceberg: map "ORCFILE" to "ORC" while creating an iceberg table
[ https://issues.apache.org/jira/browse/HIVE-26476?focusedWorklogId=802419&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-802419 ] ASF GitHub Bot logged work on HIVE-26476: - Author: ASF GitHub Bot Created on: 22/Aug/22 09:52 Start Date: 22/Aug/22 09:52 Worklog Time Spent: 10m Work Description: sonarcloud[bot] commented on PR #3525: URL: https://github.com/apache/hive/pull/3525#issuecomment-1222121207 Kudos, SonarCloud Quality Gate passed! 0 Bugs, 0 Vulnerabilities, 0 Security Hotspots, 10 Code Smells; no coverage or duplication information. Issue Time Tracking --- Worklog Id: (was: 802419) Time Spent: 0.5h (was: 20m) > Iceberg: map "ORCFILE" to "ORC" while creating an iceberg table > --- > > Key: HIVE-26476 > URL: https://issues.apache.org/jira/browse/HIVE-26476 > Project: Hive > Issue Type: Bug > Reporter: Manthan B Y > Assignee: László Pintér > Priority: Major >
Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > *Issue:* Insert query failing with VERTEX_FAILURE > *Steps to Reproduce:* > # Open Beeline session > # Execute the following queries > {code:java} > DROP TABLE IF EXISTS t2; > CREATE TABLE IF NOT EXISTS t2(c0 DOUBLE , c1 DOUBLE , c2 DECIMAL) STORED BY > ICEBERG STORED AS ORCFILE; > INSERT INTO t2(c1, c0) VALUES(0.1803113419993464, 0.9381388537256228);{code} > *Result:* > {code:java} > org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getRecordWriter(HiveFileFormatUtils.java:294) > at > org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:279) > ..
[jira] [Work logged] (HIVE-26443) Add priority queueing to compaction
[ https://issues.apache.org/jira/browse/HIVE-26443?focusedWorklogId=802417&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-802417 ] ASF GitHub Bot logged work on HIVE-26443: - Author: ASF GitHub Bot Created on: 22/Aug/22 09:49 Start Date: 22/Aug/22 09:49 Worklog Time Spent: 10m Work Description: veghlaci05 commented on code in PR #3513: URL: https://github.com/apache/hive/pull/3513#discussion_r951236348 ## standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionTxnHandler.java: ## @@ -229,17 +231,33 @@ public CompactionInfo findNextToCompact(FindNextCompactRequest rqst) throws Meta } Connection dbConn = null; - Statement stmt = null; + PreparedStatement stmt = null; //need a separate stmt for executeUpdate() otherwise it will close the ResultSet(HIVE-12725) Statement updStmt = null; ResultSet rs = null; + + long poolTimeout = MetastoreConf.getTimeVar(conf, ConfVars.COMPACTOR_WORKER_POOL_TIMEOUT, TimeUnit.MILLISECONDS); + try { dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED, connPoolCompaction); -stmt = dbConn.createStatement(); -String query = "SELECT \"CQ_ID\", \"CQ_DATABASE\", \"CQ_TABLE\", \"CQ_PARTITION\", " + - "\"CQ_TYPE\", \"CQ_TBLPROPERTIES\" FROM \"COMPACTION_QUEUE\" WHERE \"CQ_STATE\" = '" + INITIATED_STATE + "'"; +StringBuilder sb = new StringBuilder(); +sb.append("SELECT \"CQ_ID\", \"CQ_DATABASE\", \"CQ_TABLE\", \"CQ_PARTITION\", " + + "\"CQ_TYPE\", \"CQ_POOL_NAME\", \"CQ_TBLPROPERTIES\" FROM \"COMPACTION_QUEUE\" WHERE \"CQ_STATE\" = '" + INITIATED_STATE + "' AND "); +boolean hasPoolName = org.apache.commons.lang3.StringUtils.isNotBlank(rqst.getPoolName()); Review Comment: Unfortunately not. `org.apache.hadoop.util.StringUtils` is already imported and used 23 times. It has the `stringifyException(e)` method, which `org.apache.commons.lang3.StringUtils` does not contain. On the other hand, it has no methods for checking string emptiness.
Since both classes are required, I decided to fully qualify the lang3 one, since it is used fewer times. Issue Time Tracking --- Worklog Id: (was: 802417) Time Spent: 5h 10m (was: 5h) > Add priority queueing to compaction > --- > > Key: HIVE-26443 > URL: https://issues.apache.org/jira/browse/HIVE-26443 > Project: Hive > Issue Type: New Feature >Reporter: László Végh >Assignee: László Végh >Priority: Major > Labels: pull-request-available > Attachments: Pool based compaction queues.docx > > Time Spent: 5h 10m > Remaining Estimate: 0h > > The details can be found in the attached design doc. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-26443) Add priority queueing to compaction
[ https://issues.apache.org/jira/browse/HIVE-26443?focusedWorklogId=802410&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-802410 ] ASF GitHub Bot logged work on HIVE-26443: - Author: ASF GitHub Bot Created on: 22/Aug/22 09:41 Start Date: 22/Aug/22 09:41 Worklog Time Spent: 10m Work Description: veghlaci05 commented on code in PR #3513: URL: https://github.com/apache/hive/pull/3513#discussion_r951228152 ## standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionTxnHandler.java: ## @@ -202,16 +202,18 @@ public Set findPotentialCompactions(int abortedThreshold, * This will grab the next compaction request off of * the queue, and assign it to the worker. * @param workerId id of the worker calling this, will be recorded in the db + * @param poolName name of the compaction pool the request belongs to, will be recorded in the db * @deprecated Replaced by * {@link CompactionTxnHandler#findNextToCompact(org.apache.hadoop.hive.metastore.api.FindNextCompactRequest)} * @return an info element for next compaction in the queue, or null if there is no work to do now. */ @Deprecated Review Comment: It's kept for backward compatibility reasons. `public CompactionInfo findNextToCompact(FindNextCompactRequest rqst) throws MetaException` should be used instead. Issue Time Tracking --- Worklog Id: (was: 802410) Time Spent: 5h (was: 4h 50m) > Add priority queueing to compaction > --- > > Key: HIVE-26443 > URL: https://issues.apache.org/jira/browse/HIVE-26443 > Project: Hive > Issue Type: New Feature > Reporter: László Végh > Assignee: László Végh > Priority: Major > Labels: pull-request-available > Attachments: Pool based compaction queues.docx > > Time Spent: 5h > Remaining Estimate: 0h > > The details can be found in the attached design doc. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (HIVE-16913) Support per-session S3 credentials
[ https://issues.apache.org/jira/browse/HIVE-16913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17582822#comment-17582822 ] zhangbutao edited comment on HIVE-16913 at 8/22/22 9:07 AM: Any update? [~ste...@apache.org] [~vihangk1] Can we take advantage of HADOOP-14556 to achieve per-session S3 credentials in Hive? In my opinion, it is easier to implement this feature by propagating the S3 parameters (fs.s3a.secret.key, fs.s3a.access.key) to HMSHandler dynamically. In short, we should make it work when the client sets these parameters: {code:java} set fs.s3a.secret.key=my_secret_key; set fs.s3a.access.key=my_access.key; set metaconf:fs.s3a.secret.key=my_secret_key; set metaconf:fs.s3a.access.key=my_access_key; {code} was (Author: zhangbutao): Any update? [~ste...@apache.org] [~vihangk1] Can we take advantage of HADOOP-14556 to achieve per-session S3 credentials in Hive? In my opinion, it is easier to implement this feature by propagating the S3 parameters (fs.s3a.secret.key, fs.s3a.access.key) to HMSHandler dynamically. In short, we should make it work when the client set these parameters: {code:java} set fs.s3a.secret.key=my_secret_key; set fs.s3a.access.key=my_access.key; set metaconf:fs.s3a.secret.key=my_secret_key; set metaconf:fs.s3a.access.key=my_access_key; {code} > Support per-session S3 credentials > -- > > Key: HIVE-16913 > URL: https://issues.apache.org/jira/browse/HIVE-16913 > Project: Hive > Issue Type: Improvement > Reporter: Vihang Karajgaonkar > Assignee: Vihang Karajgaonkar > Priority: Major > > Currently, the credentials needed to support Hive-on-S3 (or any other > cloud storage) need to be added to the hive-site.xml, either using a Hadoop > credential provider or by adding the keys to the hive-site.xml in plain text > (insecure). > This limits the use case to using a single S3 key. If we configure per-bucket > S3 keys as described [here | > http://hadoop.apache.org/docs/current/hadoop-aws/tools/hadoop-aws/index.html#Configurations_different_S3_buckets] > it exposes access to all the buckets to all the Hive users. > It is possible that there are different sets of users who would not like to > share their buckets and still want to be able to process the data using Hive. > Enabling session-level credentials would help solve such use cases. For > example, currently this doesn't work: > {noformat} > set fs.s3a.secret.key=my_secret_key; > set fs.s3a.access.key=my_access.key; > {noformat} > because the metastore is unaware of the keys. This doesn't work either: > {noformat} > set fs.s3a.secret.key=my_secret_key; > set fs.s3a.access.key=my_access.key; > set metaconf:fs.s3a.secret.key=my_secret_key; > set metaconf:fs.s3a.access.key=my_access_key; > {noformat} > This is because only certain metastore configurations, defined in > {{HiveConf.MetaVars}}, are allowed to be set by the user. If we enable the > above approaches we could potentially allow multiple S3 credentials on a > per-session basis. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (HIVE-16913) Support per-session S3 credentials
[ https://issues.apache.org/jira/browse/HIVE-16913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17582822#comment-17582822 ] zhangbutao commented on HIVE-16913: --- Any update? [~ste...@apache.org] [~vihangk1] Can we take advantage of HADOOP-14556 to achieve per-session S3 credentials in Hive? In my opinion, it is easier to implement this feature by propagating the S3 parameters (fs.s3a.secret.key, fs.s3a.access.key) to HMSHandler dynamically. In short, we should make it work when the client sets these parameters: {code:java} set fs.s3a.secret.key=my_secret_key; set fs.s3a.access.key=my_access.key; set metaconf:fs.s3a.secret.key=my_secret_key; set metaconf:fs.s3a.access.key=my_access_key; {code} > Support per-session S3 credentials > -- > > Key: HIVE-16913 > URL: https://issues.apache.org/jira/browse/HIVE-16913 > Project: Hive > Issue Type: Improvement > Reporter: Vihang Karajgaonkar > Assignee: Vihang Karajgaonkar > Priority: Major > > Currently, the credentials needed to support Hive-on-S3 (or any other > cloud storage) need to be added to the hive-site.xml, either using a Hadoop > credential provider or by adding the keys to the hive-site.xml in plain text > (insecure). > This limits the use case to using a single S3 key. If we configure per-bucket > S3 keys as described [here | > http://hadoop.apache.org/docs/current/hadoop-aws/tools/hadoop-aws/index.html#Configurations_different_S3_buckets] > it exposes access to all the buckets to all the Hive users. > It is possible that there are different sets of users who would not like to > share their buckets and still want to be able to process the data using Hive. > Enabling session-level credentials would help solve such use cases. For > example, currently this doesn't work: > {noformat} > set fs.s3a.secret.key=my_secret_key; > set fs.s3a.access.key=my_access.key; > {noformat} > because the metastore is unaware of the keys. This doesn't work either: > {noformat} > set fs.s3a.secret.key=my_secret_key; > set fs.s3a.access.key=my_access.key; > set metaconf:fs.s3a.secret.key=my_secret_key; > set metaconf:fs.s3a.access.key=my_access_key; > {noformat} > This is because only certain metastore configurations, defined in > {{HiveConf.MetaVars}}, are allowed to be set by the user. If we enable the > above approaches we could potentially allow multiple S3 credentials on a > per-session basis. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-26481) Cleaner fails with FileNotFoundException
[ https://issues.apache.org/jira/browse/HIVE-26481?focusedWorklogId=802390&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-802390 ] ASF GitHub Bot logged work on HIVE-26481: - Author: ASF GitHub Bot Created on: 22/Aug/22 08:53 Start Date: 22/Aug/22 08:53 Worklog Time Spent: 10m Work Description: ayushtkn commented on code in PR #3531: URL: https://github.com/apache/hive/pull/3531#discussion_r951177365 ## ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java: ## @@ -1503,24 +1503,31 @@ public static Map getHdfsDirSnapshotsForCleaner(final Fil stack.push(fs.listStatusIterator(path)); while (!stack.isEmpty()) { RemoteIterator itr = stack.pop(); - while (itr.hasNext()) { -FileStatus fStatus = itr.next(); -Path fPath = fStatus.getPath(); -if (acidHiddenFileFilter.accept(fPath)) { - if (baseFileFilter.accept(fPath) || - deltaFileFilter.accept(fPath) || - deleteEventDeltaDirFilter.accept(fPath)) { -addToSnapshoot(dirToSnapshots, fPath); - } else { -if (fStatus.isDirectory()) { - stack.push(fs.listStatusIterator(fPath)); + try { +while (itr.hasNext()) { + FileStatus fStatus = itr.next(); + Path fPath = fStatus.getPath(); + if (acidHiddenFileFilter.accept(fPath)) { +if (baseFileFilter.accept(fPath) || +deltaFileFilter.accept(fPath) || +deleteEventDeltaDirFilter.accept(fPath)) { + addToSnapshoot(dirToSnapshots, fPath); } else { - // Found an original file - HdfsDirSnapshot hdfsDirSnapshot = addToSnapshoot(dirToSnapshots, fPath.getParent()); - hdfsDirSnapshot.addFile(fStatus); + if (fStatus.isDirectory()) { +stack.push(fs.listStatusIterator(fPath)); + } else { +// Found an original file +HdfsDirSnapshot hdfsDirSnapshot = addToSnapshoot(dirToSnapshots, fPath.getParent()); +hdfsDirSnapshot.addFile(fStatus); + } } } } + }catch(FileNotFoundException fne){ +//Ignore +//As current FS API doesn't provide the ability to supply a PathFilter to ignore the staging dirs, +// need to catch this exception Review Comment: There is something like 
RemoteIterators.filteringRemoteIterator(fs.listStatusIterator(path), ) provided by Hadoop for such use cases. https://github.com/apache/hadoop/blob/7f176d080c2576e512cbd401fce1a8d935b18ca7/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/functional/RemoteIterators.java#L160-L177 Issue Time Tracking --- Worklog Id: (was: 802390) Time Spent: 1h 40m (was: 1.5h) > Cleaner fails with FileNotFoundException > > > Key: HIVE-26481 > URL: https://issues.apache.org/jira/browse/HIVE-26481 > Project: Hive > Issue Type: Bug >Reporter: KIRTI RUGE >Priority: Major > Labels: pull-request-available > Time Spent: 1h 40m > Remaining Estimate: 0h > > The compaction fails when the Cleaner tried to remove a missing directory > from HDFS. > {code:java} > 2022-08-05 18:56:38,873 INFO org.apache.hadoop.hive.ql.txn.compactor.Cleaner: > [Cleaner-executor-thread-0]: Starting cleaning for > id:30,dbname:default,tableName:test_concur_compaction_minor,partName:null,state:�,type:MINOR,enqueueTime:0,start:0,properties:null,runAs:hive,tooManyAborts:false,hasOldAbort:false,highestWriteId:4,errorMessage:null,workerId: > null,initiatorId: null 2022-08-05 18:56:38,888 ERROR > org.apache.hadoop.hive.ql.txn.compactor.Cleaner: [Cleaner-executor-thread-0]: > Caught exception when cleaning, unable to complete cleaning of > id:30,dbname:default,tableName:test_concur_compaction_minor,partName:null,state:�,type:MINOR,enqueueTime:0,start:0,properties:null,runAs:hive,tooManyAborts:false,hasOldAbort:false,highestWriteId:4,errorMessage:null,workerId: > null,initiatorId: null java.io.FileNotFoundException: File > hdfs://ns1/warehouse/tablespace/managed/hive/test_concur_compaction_minor/.hive-staging_hive_2022-08-05_18-56-37_115_5049319600695911622-37 > does not exist. 
> at org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1275)
> at org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1249)
> at org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1194)
> at org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1190)
> at org.apache.hadoop.fs.FileSystemLinkResolver
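The filtering approach suggested in the review comment above (Hadoop's RemoteIterators.filteringRemoteIterator) wraps a listing iterator with a predicate so unwanted entries, such as staging directories, are skipped before the caller ever recurses into them. The pattern can be sketched in plain Java as a standalone illustration; this uses java.util.Iterator instead of Hadoop's RemoteIterator and is not Hive code:

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

// Standalone sketch of the filtering-iterator pattern behind
// RemoteIterators.filteringRemoteIterator: wrap an iterator so that
// elements failing a predicate are skipped, instead of filtering
// (or catching FileNotFoundException) inside the consuming loop.
public class FilteringIteratorDemo {

    static <T> Iterator<T> filtering(Iterator<T> inner, Predicate<T> accept) {
        return new Iterator<T>() {
            private T next;          // next accepted element, if any
            private boolean ready;   // whether 'next' holds a valid element

            @Override
            public boolean hasNext() {
                while (!ready && inner.hasNext()) {
                    T candidate = inner.next();
                    if (accept.test(candidate)) {
                        next = candidate;
                        ready = true;
                    }
                }
                return ready;
            }

            @Override
            public T next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                ready = false;
                return next;
            }
        };
    }

    public static void main(String[] args) {
        // Skip hidden staging-style entries, analogous to skipping
        // .hive-staging directories while listing a table location.
        List<String> listing = List.of("base_1", ".hive-staging_x", "delta_2", "_tmp");
        Iterator<String> it = filtering(listing.iterator(),
                p -> !p.startsWith(".") && !p.startsWith("_"));
        StringBuilder sb = new StringBuilder();
        while (it.hasNext()) {
            sb.append(it.next()).append(' ');
        }
        System.out.println(sb.toString().trim());
    }
}
```

Filtering the paths up front means a staging directory that disappears concurrently is never pushed onto the listing stack in the first place, which is why the reviewer prefers it to catching FileNotFoundException after the fact.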
[jira] [Work logged] (HIVE-26481) Cleaner fails with FileNotFoundException
[ https://issues.apache.org/jira/browse/HIVE-26481?focusedWorklogId=802388&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-802388 ]

ASF GitHub Bot logged work on HIVE-26481:
-
Author: ASF GitHub Bot
Created on: 22/Aug/22 08:51
Start Date: 22/Aug/22 08:51
Worklog Time Spent: 10m

Work Description: ayushtkn commented on code in PR #3531:
URL: https://github.com/apache/hive/pull/3531#discussion_r951177365
(the same review comment on AcidUtils.java as in worklog 802390 above: RemoteIterators.filteringRemoteIterator(fs.listStatusIterator(path), ) provided by Hadoop for such use cases)

Issue Time Tracking
---
Worklog Id: (was: 802388)
Time Spent: 1.5h (was: 1h 20m)
[jira] [Work logged] (HIVE-26464) New credential provider for replicating to the cloud
[ https://issues.apache.org/jira/browse/HIVE-26464?focusedWorklogId=802386&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-802386 ]

ASF GitHub Bot logged work on HIVE-26464:
-
Author: ASF GitHub Bot
Created on: 22/Aug/22 08:49
Start Date: 22/Aug/22 08:49
Worklog Time Spent: 10m

Work Description: pudidic commented on PR #3526:
URL: https://github.com/apache/hive/pull/3526#issuecomment-1222048974
There's a Javadoc generation failure and a few test failures. The timeout test is an unreliable one, so it's not relevant. My +1 is valid when the Javadoc issue is resolved, and the TestCliDriver[hybridgrace_hashjoin_2] – org.apache.hadoop.hive.cli.TestMiniTezCliDriver failure is resolved.

Issue Time Tracking
---
Worklog Id: (was: 802386)
Time Spent: 0.5h (was: 20m)

> New credential provider for replicating to the cloud
> ----------------------------------------------------
> Key: HIVE-26464
> URL: https://issues.apache.org/jira/browse/HIVE-26464
> Project: Hive
> Issue Type: Task
> Components: HiveServer2, repl
> Reporter: Peter Felker
> Assignee: Peter Felker
> Priority: Major
> Labels: pull-request-available
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> In {{ReplDumpTask}}, if the following *new* config is provided in {{HiveConf}}:
> * {{hive.repl.cloud.credential.provider.path}}
> then the HS2 credstore URI scheme, contained by {{HiveConf}} with key {{hadoop.security.credential.provider.path}}, should be updated so that it will start with the new scheme: {{hiverepljceks}}.
> For instance:
> {code}jceks://file/path/to/credstore/creds.localjceks{code}
> will become:
> {code}hiverepljceks://file/path/to/credstore/creds.localjceks{code}
> This new scheme, {{hiverepljceks}}, will make Hadoop use a *new* credential provider, which will do the following:
> # Load the HS2 keystore file, defined by key {{hadoop.security.credential.provider.path}}
> # Get a password from the HS2 keystore file, with key: {{hive.repl.cloud.credential.provider.password}}
> # Use this password to load another keystore file, located on HDFS and specified by the new config mentioned above: {{hive.repl.cloud.credential.provider.path}}. This keystore contains the cloud credentials for the Hive cloud replication.

-- This message was sent by Atlassian Jira (v8.20.10#820010)
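The scheme rewrite described in this ticket can be sketched as a plain string transformation. This is a hypothetical illustration of the idea (the method name and constants here are illustrative, not Hive's actual implementation): Hadoop's credential-provider machinery dispatches on the URI scheme of {{hadoop.security.credential.provider.path}}, so swapping the scheme routes the lookup to the new provider.

```java
// Hypothetical sketch of the scheme rewrite described above: swap the
// provider-path scheme from "jceks" to "hiverepljceks" so that Hadoop's
// credential-provider factory dispatches to a different provider.
// Names and constants are illustrative, not Hive's actual code.
public class SchemeRewriteDemo {

    static final String HS2_SCHEME = "jceks";
    static final String REPL_SCHEME = "hiverepljceks";

    // Rewrite "jceks://..." to "hiverepljceks://..."; leave other values alone.
    static String toReplScheme(String providerPath) {
        if (providerPath != null && providerPath.startsWith(HS2_SCHEME + "://")) {
            return REPL_SCHEME + providerPath.substring(HS2_SCHEME.length());
        }
        return providerPath;
    }

    public static void main(String[] args) {
        System.out.println(toReplScheme("jceks://file/path/to/credstore/creds.localjceks"));
    }
}
```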
[jira] [Work logged] (HIVE-26363) Time logged during repldump and replload per table is not in readable format
[ https://issues.apache.org/jira/browse/HIVE-26363?focusedWorklogId=802384&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-802384 ]

ASF GitHub Bot logged work on HIVE-26363:
-
Author: ASF GitHub Bot
Created on: 22/Aug/22 08:45
Start Date: 22/Aug/22 08:45
Worklog Time Spent: 10m

Work Description: pudidic merged PR #3541: URL: https://github.com/apache/hive/pull/3541

Issue Time Tracking
---
Worklog Id: (was: 802384)
Time Spent: 2h 20m (was: 2h 10m)

> Time logged during repldump and replload per table is not in readable format
> ----------------------------------------------------------------------------
> Key: HIVE-26363
> URL: https://issues.apache.org/jira/browse/HIVE-26363
> Project: Hive
> Issue Type: Improvement
> Components: HiveServer2, repl
> Affects Versions: 4.0.0
> Reporter: Imran
> Assignee: Rakshith C
> Priority: Minor
> Labels: pull-request-available
> Time Spent: 2h 20m
> Remaining Estimate: 0h
>
> During replDump and replLoad we capture the time taken for each activity in the hive.log file. This is captured in milliseconds, which becomes difficult to read during debugging; this ticket is raised to change the time logged in hive.log to UTC format.

-- This message was sent by Atlassian Jira (v8.20.10#820010)
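The kind of formatting change this ticket asks for can be sketched with the standard java.time API. This is an assumed illustration, not the actual patch: a raw epoch-millis value rendered as an ISO-8601 UTC timestamp, and an elapsed-millis value rendered in hours/minutes/seconds.

```java
import java.time.Duration;
import java.time.Instant;

// Illustrative sketch (not Hive's actual patch) of turning raw millisecond
// values into readable form: an epoch timestamp rendered as ISO-8601 UTC,
// and an elapsed duration rendered as h/m/s.
public class ReadableTimeDemo {

    // Epoch millis -> ISO-8601 UTC string, e.g. "1970-01-01T00:00:00Z".
    static String asUtc(long epochMillis) {
        return Instant.ofEpochMilli(epochMillis).toString();
    }

    // Elapsed millis -> "Hh Mm Ss" (requires Java 9+ for toMinutesPart/toSecondsPart).
    static String asElapsed(long millis) {
        Duration d = Duration.ofMillis(millis);
        return String.format("%dh %dm %ds",
                d.toHours(), d.toMinutesPart(), d.toSecondsPart());
    }

    public static void main(String[] args) {
        System.out.println(asUtc(0L));             // epoch start in UTC
        System.out.println(asElapsed(3_723_000L)); // 3723 seconds elapsed
    }
}
```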
[jira] [Work logged] (HIVE-26363) Time logged during repldump and replload per table is not in readable format
[ https://issues.apache.org/jira/browse/HIVE-26363?focusedWorklogId=802383&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-802383 ]

ASF GitHub Bot logged work on HIVE-26363:
-
Author: ASF GitHub Bot
Created on: 22/Aug/22 08:44
Start Date: 22/Aug/22 08:44
Worklog Time Spent: 10m

Work Description: pudidic commented on PR #3541:
URL: https://github.com/apache/hive/pull/3541#issuecomment-1222042812
Looks good to me. As it's a trivial change, I'll just merge it.

Issue Time Tracking
---
Worklog Id: (was: 802383)
Time Spent: 2h 10m (was: 2h)

> Time logged during repldump and replload per table is not in readable format
> ----------------------------------------------------------------------------
> Key: HIVE-26363
> URL: https://issues.apache.org/jira/browse/HIVE-26363
> Project: Hive
> Issue Type: Improvement
> Components: HiveServer2, repl
> Affects Versions: 4.0.0
> Reporter: Imran
> Assignee: Rakshith C
> Priority: Minor
> Labels: pull-request-available
> Time Spent: 2h 10m
> Remaining Estimate: 0h
>
> During replDump and replLoad we capture the time taken for each activity in the hive.log file. This is captured in milliseconds, which becomes difficult to read during debugging; this ticket is raised to change the time logged in hive.log to UTC format.

-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-26472) Concurrent UPDATEs can cause duplicate rows
[ https://issues.apache.org/jira/browse/HIVE-26472?focusedWorklogId=802368&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-802368 ]

ASF GitHub Bot logged work on HIVE-26472:
-
Author: ASF GitHub Bot
Created on: 22/Aug/22 08:10
Start Date: 22/Aug/22 08:10
Worklog Time Spent: 10m

Work Description: sonarcloud[bot] commented on PR #3524:
URL: https://github.com/apache/hive/pull/3524#issuecomment-1222002269
Kudos, SonarCloud Quality Gate passed! 0 Bugs, 0 Vulnerabilities, 0 Security Hotspots, 16 Code Smells; no coverage information, no duplication information. (https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=3524)

Issue Time Tracking
---
Worklog Id: (was: 802368)
Time Spent: 1.5h (was: 1h 20m)

> Concurrent UPDATEs can cause duplicate rows
> -------------------------------------------
> Key: HIVE-26472
> URL: https://issues.apache.org/jira/browse/HIVE-26472
> Project: Hive
> Issue Type: Bug
> Components: HiveServer2
> Affects Versions: 4.0.0-alpha-1
> Reporter: John Sherman
> Assignee: John Sherman
> Priority: Critical
> Labels: pull-request-available
> Attachments: debug.diff
>
> Time Spent: 1.5h
> Remaining Estimate: 0h
>
> Concurrent UPDATEs to the same table can cause duplicate rows when the following occurs:
> Two UPDATEs get assigned txnIds and writeIds like this:
> UPDATE #1 = txnId: 100 writeId: 50 <--- commits first
> UPDATE #2 = txnId: 101 writeId: 49
> To replicate the issue:
> I applied the attached debug.diff patch, which adds hive.lock.sleep.writeid (which controls the amount to sleep before acquiring a writeId) and hive.lock.sleep.post.writeid (which controls the amount to sleep after acquiring a writeId).
> {code:jav
[jira] [Work logged] (HIVE-26363) Time logged during repldump and replload per table is not in readable format
[ https://issues.apache.org/jira/browse/HIVE-26363?focusedWorklogId=802359&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-802359 ]

ASF GitHub Bot logged work on HIVE-26363:
-
Author: ASF GitHub Bot
Created on: 22/Aug/22 07:36
Start Date: 22/Aug/22 07:36
Worklog Time Spent: 10m

Work Description: sonarcloud[bot] commented on PR #3541:
URL: https://github.com/apache/hive/pull/3541#issuecomment-1221967884
Kudos, SonarCloud Quality Gate passed! 0 Bugs, 0 Vulnerabilities, 0 Security Hotspots, 10 Code Smells; no coverage information, no duplication information. (https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=3541)

Issue Time Tracking
---
Worklog Id: (was: 802359)
Time Spent: 2h (was: 1h 50m)

> Time logged during repldump and replload per table is not in readable format
> ----------------------------------------------------------------------------
> Key: HIVE-26363
> URL: https://issues.apache.org/jira/browse/HIVE-26363
> Project: Hive
> Issue Type: Improvement
> Components: HiveServer2, repl
> Affects Versions:
> 4.0.0
> Reporter: Imran
> Assignee: Rakshith C
> Priority: Minor
> Labels: pull-request-available
> Time Spent: 2h
> Remaining Estimate: 0h
>
> During replDump and replLoad we capture the time taken for each activity in the hive.log file. This is captured in milliseconds, which becomes difficult to read during debugging; this ticket is raised to change the time logged in hive.log to UTC format.

-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HIVE-26490) Iceberg: Residual expression is constructed for the task from multiple places causing CPU burn
[ https://issues.apache.org/jira/browse/HIVE-26490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Balamohan updated HIVE-26490: Labels: iceberg performance (was: performance) > Iceberg: Residual expression is constructed for the task from multiple places > causing CPU burn > -- > > Key: HIVE-26490 > URL: https://issues.apache.org/jira/browse/HIVE-26490 > Project: Hive > Issue Type: Bug >Reporter: Rajesh Balamohan >Priority: Major > Labels: iceberg, performance > Attachments: Screenshot 2022-08-22 at 12.58.47 PM.jpg > > > "HiveIcebergInputFormat.residualForTask(task, job)" is invoked from multiple > places causing CPU burn. > !Screenshot 2022-08-22 at 12.58.47 PM.jpg|width=918,height=932! -- This message was sent by Atlassian Jira (v8.20.10#820010)
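The usual fix for "the same expensive expression is rebuilt from multiple call sites", as reported for HiveIcebergInputFormat.residualForTask above, is to compute once per key and cache the result. The sketch below is a generic illustration of that memoization pattern; the names (residualFor, the cache key) are illustrative and not the Iceberg/Hive code itself.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Generic memoization sketch: several call sites asking for the same
// task's "residual" trigger only one computation. Illustrative names,
// not the actual Hive/Iceberg implementation.
public class MemoDemo {
    static final AtomicInteger computations = new AtomicInteger();

    public static void main(String[] args) {
        Map<String, String> cache = new ConcurrentHashMap<>();
        Function<String, String> residualFor = task -> cache.computeIfAbsent(task, t -> {
            computations.incrementAndGet(); // stands in for the expensive expression build
            return "residual(" + t + ")";
        });

        // Three call sites request the same task; two of them hit the cache.
        residualFor.apply("task-1");
        residualFor.apply("task-1");
        residualFor.apply("task-1");
        System.out.println("computed " + computations.get() + " time(s)");
    }
}
```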
[jira] [Resolved] (HIVE-26483) Use DDL_NO_LOCK when running iceberg CTAS query
[ https://issues.apache.org/jira/browse/HIVE-26483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] László Pintér resolved HIVE-26483. -- Resolution: Fixed > Use DDL_NO_LOCK when running iceberg CTAS query > --- > > Key: HIVE-26483 > URL: https://issues.apache.org/jira/browse/HIVE-26483 > Project: Hive > Issue Type: Bug >Reporter: László Pintér >Assignee: László Pintér >Priority: Major > Labels: pull-request-available > Time Spent: 1h 10m > Remaining Estimate: 0h > > HIVE-26244 introduced locking when running CTAS queries. Iceberg already had > an explicit locking implemented for CTAS queries and after HIVE-26244 the two > locking mechanisms started colliding resulting in a deadlock. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (HIVE-26483) Use DDL_NO_LOCK when running iceberg CTAS query
[ https://issues.apache.org/jira/browse/HIVE-26483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17582776#comment-17582776 ] László Pintér commented on HIVE-26483: -- Merged into master. Thanks, [~dkuzmenko] for the review! > Use DDL_NO_LOCK when running iceberg CTAS query > --- > > Key: HIVE-26483 > URL: https://issues.apache.org/jira/browse/HIVE-26483 > Project: Hive > Issue Type: Bug >Reporter: László Pintér >Assignee: László Pintér >Priority: Major > Labels: pull-request-available > Time Spent: 1h 10m > Remaining Estimate: 0h > > HIVE-26244 introduced locking when running CTAS queries. Iceberg already had > an explicit locking implemented for CTAS queries and after HIVE-26244 the two > locking mechanisms started colliding resulting in a deadlock. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-26483) Use DDL_NO_LOCK when running iceberg CTAS query
[ https://issues.apache.org/jira/browse/HIVE-26483?focusedWorklogId=802346&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-802346 ] ASF GitHub Bot logged work on HIVE-26483: - Author: ASF GitHub Bot Created on: 22/Aug/22 07:01 Start Date: 22/Aug/22 07:01 Worklog Time Spent: 10m Work Description: lcspinter merged PR #3533: URL: https://github.com/apache/hive/pull/3533 Issue Time Tracking --- Worklog Id: (was: 802346) Time Spent: 1h 10m (was: 1h) > Use DDL_NO_LOCK when running iceberg CTAS query > --- > > Key: HIVE-26483 > URL: https://issues.apache.org/jira/browse/HIVE-26483 > Project: Hive > Issue Type: Bug >Reporter: László Pintér >Assignee: László Pintér >Priority: Major > Labels: pull-request-available > Time Spent: 1h 10m > Remaining Estimate: 0h > > HIVE-26244 introduced locking when running CTAS queries. Iceberg already had > an explicit locking implemented for CTAS queries and after HIVE-26244 the two > locking mechanisms started colliding resulting in a deadlock. -- This message was sent by Atlassian Jira (v8.20.10#820010)