[jira] [Assigned] (HIVE-27195) Drop table if Exists . fails during authorization for temporary tables
[ https://issues.apache.org/jira/browse/HIVE-27195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Riju Trivedi reassigned HIVE-27195:
-----------------------------------

> Drop table if Exists . fails during authorization for temporary tables
> ----------------------------------------------------------------------
>
> Key: HIVE-27195
> URL: https://issues.apache.org/jira/browse/HIVE-27195
> Project: Hive
> Issue Type: Bug
> Reporter: Riju Trivedi
> Assignee: Riju Trivedi
> Priority: Major
>
> https://issues.apache.org/jira/browse/HIVE-20051 added skipping of authorization for temporary tables, but DROP TABLE IF EXISTS on a temporary table still fails with HiveAccessControlException.
> Steps to reproduce:
> {code:java}
> use test;
> CREATE TEMPORARY TABLE temp_table (id int);
> drop table if exists test.temp_table;
> Error: Error while compiling statement: FAILED: HiveAccessControlException
> Permission denied: user [rtrivedi] does not have [DROP] privilege on
> [test/temp_table] (state=42000,code=4)
> {code}

-- This message was sent by Atlassian Jira (v8.20.10#820010)
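The behavior the ticket asks for can be modeled with a small guard that short-circuits authorization for temporary tables. This is a minimal sketch only, not Hive's actual authorizer code; the `Table` class and `requiresAuthorization` helper below are hypothetical stand-ins.

```java
// Minimal model of the intended behavior: authorization checks are
// skipped for temporary tables, including DROP. "Table" and
// "requiresAuthorization" are hypothetical stand-ins, not Hive's
// actual authorizer classes.
public class TempTableAuthSketch {
    public static final class Table {
        public final String name;
        public final boolean temporary;
        public Table(String name, boolean temporary) {
            this.name = name;
            this.temporary = temporary;
        }
        public boolean isTemporary() { return temporary; }
    }

    /** DROP (or any other privilege) need not be checked for temporary tables. */
    public static boolean requiresAuthorization(Table t) {
        return !t.isTemporary();
    }

    public static void main(String[] args) {
        Table temp = new Table("test.temp_table", true);
        Table regular = new Table("test.regular_table", false);
        // A temporary table should bypass the DROP privilege check.
        System.out.println("temp needs auth: " + requiresAuthorization(temp));       // false
        System.out.println("regular needs auth: " + requiresAuthorization(regular)); // true
    }
}
```

With such a guard in place, the `drop table if exists test.temp_table` in the repro above would never reach the privilege check that raises HiveAccessControlException.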
[jira] [Work logged] (HIVE-27179) HS2 WebUI throws NPE when JspFactory loaded from jetty-runner
[ https://issues.apache.org/jira/browse/HIVE-27179?focusedWorklogId=853828&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853828 ]

ASF GitHub Bot logged work on HIVE-27179:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 30/Mar/23 04:49
Start Date: 30/Mar/23 04:49
Worklog Time Spent: 10m

Work Description: dengzhhu653 commented on code in PR #4164:
URL: https://github.com/apache/hive/pull/4164#discussion_r1152729260

## service/src/java/org/apache/hive/service/server/HiveServer2.java:
## @@ -133,6 +133,8 @@
 import com.google.common.util.concurrent.SettableFuture;
 import com.google.common.util.concurrent.ThreadFactoryBuilder;
+import javax.servlet.jsp.JspFactory;

Review Comment: Hi @saihemanth-cloudera, the jetty-runner jar contains the classes:
```
[Loaded javax.servlet.jsp.JspFactory from file:/opt/xxx/jetty-runner-9.4.48.v20220622.jar]
[Loaded javax.servlet.jsp.JspContext from file:/opt/xxx/jetty-runner-9.4.48.v20220622.jar]
[Loaded javax.servlet.jsp.PageContext from file:/opt/xxx/jetty-runner-9.4.48.v20220622.jar]
```
In fact this is where the class conflict happens.

Issue Time Tracking
-------------------
Worklog Id: (was: 853828)
Time Spent: 50m (was: 40m)

> HS2 WebUI throws NPE when JspFactory loaded from jetty-runner
> -------------------------------------------------------------
>
> Key: HIVE-27179
> URL: https://issues.apache.org/jira/browse/HIVE-27179
> Project: Hive
> Issue Type: Bug
> Components: HiveServer2
> Reporter: Zhihua Deng
> Priority: Major
> Labels: pull-request-available
> Time Spent: 50m
> Remaining Estimate: 0h
>
> In HIVE-17088, we resolved an NPE thrown from the HS2 WebUI by introducing javax.servlet.jsp-api.
> It works as expected when the javax.servlet.jsp-api jar takes precedence over the jetty-runner jar, but in some environments the ordering differs and opening the HS2 web UI still throws an NPE:
> {noformat}
> java.lang.NullPointerException
> at org.apache.hive.generated.hiveserver2.hiveserver2_jsp._jspService(hiveserver2_jsp.java:286)
> at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:71)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
> at org.eclipse.jetty.servlet.ServletHolder$NotAsync.service(ServletHolder.java:1443)
> at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:791)
> at org.eclipse.jetty.servlet.ServletHandler$ChainEnd.doFilter(ServletHandler.java:1626)
> ...{noformat}
> The jetty-runner JspFactory.getDefaultFactory() just returns null.
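The failure mode above, `JspFactory.getDefaultFactory()` returning null when the jetty-runner copy of the class wins, can be modeled as a null-guard: check the default factory at startup and install a working implementation before the web UI serves its first JSP. This sketches only the idea; `FactoryRegistry` and `FallbackFactory` are hypothetical names, not the javax.servlet.jsp or Jasper API.

```java
// Models the null-guard idea: if the registered default factory is null
// (as with the jetty-runner JspFactory), install a fallback before any
// JSP is served. FactoryRegistry/FallbackFactory are hypothetical names.
public class JspFactoryGuardSketch {
    public interface Factory { String name(); }

    /** Stands in for JspFactory's static default-factory slot. */
    public static final class FactoryRegistry {
        private static Factory defaultFactory; // starts null, like jetty-runner's

        public static Factory getDefaultFactory() { return defaultFactory; }
        public static void setDefaultFactory(Factory f) { defaultFactory = f; }
    }

    public static final class FallbackFactory implements Factory {
        public String name() { return "fallback"; }
    }

    /** Run once at server startup, before the web UI is started. */
    public static Factory ensureDefaultFactory() {
        if (FactoryRegistry.getDefaultFactory() == null) {
            FactoryRegistry.setDefaultFactory(new FallbackFactory());
        }
        return FactoryRegistry.getDefaultFactory();
    }

    public static void main(String[] args) {
        // Before the guard the slot is null, which is what produced the
        // NPE inside the generated hiveserver2_jsp servlet.
        System.out.println(FactoryRegistry.getDefaultFactory()); // null
        Factory f = ensureDefaultFactory();
        System.out.println(f.name()); // fallback
    }
}
```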
[jira] [Work logged] (HIVE-27150) Drop single partition can also support direct sql
[ https://issues.apache.org/jira/browse/HIVE-27150?focusedWorklogId=853827&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853827 ]

ASF GitHub Bot logged work on HIVE-27150:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 30/Mar/23 04:17
Start Date: 30/Mar/23 04:17
Worklog Time Spent: 10m

Work Description: saihemanth-cloudera commented on code in PR #4123:
URL: https://github.com/apache/hive/pull/4123#discussion_r1152712297

## standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/RawStore.java:
## @@ -459,16 +459,15 @@
 * @param catName catalog name.
 * @param dbName database name.
 * @param tableName table name.
- * @param part_vals list of partition values.
+ * @param partName partition name.
 * @return true if the partition was dropped.
 * @throws MetaException Error accessing the RDBMS.
 * @throws NoSuchObjectException no partition matching this description exists
 * @throws InvalidObjectException error dropping the statistics for the partition
 * @throws InvalidInputException error dropping the statistics for the partition
 */
- boolean dropPartition(String catName, String dbName, String tableName,
-     List part_vals) throws MetaException, NoSuchObjectException, InvalidObjectException,
-     InvalidInputException;
+ boolean dropPartition(String catName, String dbName, String tableName, String partName)

Review Comment: I would favor this change instead of passing the table into the drop partition API in the object store.
Issue Time Tracking
-------------------
Worklog Id: (was: 853827)
Time Spent: 2h 40m (was: 2.5h)

> Drop single partition can also support direct sql
> -------------------------------------------------
>
> Key: HIVE-27150
> URL: https://issues.apache.org/jira/browse/HIVE-27150
> Project: Hive
> Issue Type: Improvement
> Components: Hive
> Reporter: Wechar
> Assignee: Wechar
> Priority: Major
> Labels: pull-request-available
> Time Spent: 2h 40m
> Remaining Estimate: 0h
>
> *Background:*
> [HIVE-6980|https://issues.apache.org/jira/browse/HIVE-6980] added direct SQL support for drop_partitions; we can reuse that improvement in drop_partition.
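The refactor discussed above replaces a list of partition values with a single canonical partition name. A partition name is just the partition columns and values joined as `col=val` pairs with `/`. The helper below is a sketch of that encoding; it deliberately omits the special-character escaping that Hive's real `Warehouse.makePartName` performs.

```java
import java.util.List;

// Sketch of building a canonical partition name from columns and values,
// e.g. ["ds","hr"] + ["2023-03-30","01"] -> "ds=2023-03-30/hr=01".
// Hive's real Warehouse.makePartName also escapes special characters;
// that is omitted here.
public class PartNameSketch {
    public static String makePartName(List<String> cols, List<String> vals) {
        if (cols.size() != vals.size()) {
            throw new IllegalArgumentException("columns and values differ in length");
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < cols.size(); i++) {
            if (i > 0) {
                sb.append('/');
            }
            sb.append(cols.get(i)).append('=').append(vals.get(i));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(makePartName(List.of("ds", "hr"), List.of("2023-03-30", "01")));
        // -> ds=2023-03-30/hr=01
    }
}
```

Passing this single string lets the metastore look the partition up by its `PART_NAME` column directly, which is what makes the direct-SQL path practical.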
[jira] [Work logged] (HIVE-27179) HS2 WebUI throws NPE when JspFactory loaded from jetty-runner
[ https://issues.apache.org/jira/browse/HIVE-27179?focusedWorklogId=853826&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853826 ]

ASF GitHub Bot logged work on HIVE-27179:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 30/Mar/23 04:12
Start Date: 30/Mar/23 04:12
Worklog Time Spent: 10m

Work Description: saihemanth-cloudera commented on code in PR #4164:
URL: https://github.com/apache/hive/pull/4164#discussion_r1152709966

## service/src/java/org/apache/hive/service/server/HiveServer2.java:
## @@ -133,6 +133,8 @@
 import com.google.common.util.concurrent.SettableFuture;
 import com.google.common.util.concurrent.ThreadFactoryBuilder;
+import javax.servlet.jsp.JspFactory;

Review Comment: I see that you have removed the javax.servlet dependency from the root and service pom.xml files. I'm wondering where this dependency is coming from.

Issue Time Tracking
-------------------
Worklog Id: (was: 853826)
Time Spent: 40m (was: 0.5h)

> HS2 WebUI throws NPE when JspFactory loaded from jetty-runner
> -------------------------------------------------------------
>
> Key: HIVE-27179
> URL: https://issues.apache.org/jira/browse/HIVE-27179
> Project: Hive
> Issue Type: Bug
> Components: HiveServer2
> Reporter: Zhihua Deng
> Priority: Major
> Labels: pull-request-available
> Time Spent: 40m
> Remaining Estimate: 0h
>
> In HIVE-17088, we resolved an NPE thrown from the HS2 WebUI by introducing javax.servlet.jsp-api.
> It works as expected when the javax.servlet.jsp-api jar takes precedence over the jetty-runner jar, but in some environments the ordering differs and opening the HS2 web UI still throws an NPE:
> {noformat}
> java.lang.NullPointerException
> at org.apache.hive.generated.hiveserver2.hiveserver2_jsp._jspService(hiveserver2_jsp.java:286)
> at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:71)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
> at org.eclipse.jetty.servlet.ServletHolder$NotAsync.service(ServletHolder.java:1443)
> at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:791)
> at org.eclipse.jetty.servlet.ServletHandler$ChainEnd.doFilter(ServletHandler.java:1626)
> ...{noformat}
> The jetty-runner JspFactory.getDefaultFactory() just returns null.
[jira] [Work logged] (HIVE-27192) Use normal import instead of shaded import in TestSchemaToolCatalogOps.java
[ https://issues.apache.org/jira/browse/HIVE-27192?focusedWorklogId=853825&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853825 ]

ASF GitHub Bot logged work on HIVE-27192:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 30/Mar/23 04:02
Start Date: 30/Mar/23 04:02
Worklog Time Spent: 10m

Work Description: SourabhBadhya commented on code in PR #4169:
URL: https://github.com/apache/hive/pull/4169#discussion_r1152705138

## itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/tools/schematool/TestSchemaToolCatalogOps.java:
## @@ -35,7 +35,7 @@
 import org.apache.hadoop.hive.metastore.client.builder.PartitionBuilder;
 import org.apache.hadoop.hive.metastore.client.builder.TableBuilder;
 import org.apache.hadoop.hive.metastore.conf.MetastoreConf;
-import org.apache.hive.com.google.common.io.Files;
+import com.google.common.io.Files;

Review Comment: nit: After removing the shading prefix, this import no longer follows the alphabetical import order.

Issue Time Tracking
-------------------
Worklog Id: (was: 853825)
Time Spent: 0.5h (was: 20m)

> Use normal import instead of shaded import in TestSchemaToolCatalogOps.java
> ---------------------------------------------------------------------------
>
> Key: HIVE-27192
> URL: https://issues.apache.org/jira/browse/HIVE-27192
> Project: Hive
> Issue Type: Improvement
> Reporter: Zoltán Rátkai
> Assignee: Zoltán Rátkai
> Priority: Minor
> Labels: pull-request-available
> Time Spent: 0.5h
> Remaining Estimate: 0h
[jira] [Work logged] (HIVE-22383) `alterPartitions` is invoked twice during dynamic partition load causing runtime delay
[ https://issues.apache.org/jira/browse/HIVE-22383?focusedWorklogId=853814&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853814 ]

ASF GitHub Bot logged work on HIVE-22383:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 30/Mar/23 02:18
Start Date: 30/Mar/23 02:18
Worklog Time Spent: 10m

Work Description: rbalamohan commented on code in PR #4161:
URL: https://github.com/apache/hive/pull/4161#discussion_r1152661303

## ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java:
## @@ -2951,9 +2951,11 @@
 private void setStatsPropAndAlterPartitions(boolean resetStatistics, Table tbl,
     validWriteIdList = tableSnapshot.getValidWriteIdList();
     writeId = tableSnapshot.getWriteId();
 }
-getSynchronizedMSC().alter_partitions(tbl.getCatName(), tbl.getDbName(), tbl.getTableName(),
-    partitions.stream().map(Partition::getTPartition).collect(Collectors.toList()),
-    ec, validWriteIdList, writeId);
+if (!conf.getBoolVar(ConfVars.HIVESTATSAUTOGATHER)){

Review Comment: Should this check be at the start of the method itself?

Issue Time Tracking
-------------------
Worklog Id: (was: 853814)
Time Spent: 50m (was: 40m)

> `alterPartitions` is invoked twice during dynamic partition load causing runtime delay
> --------------------------------------------------------------------------------------
>
> Key: HIVE-22383
> URL: https://issues.apache.org/jira/browse/HIVE-22383
> Project: Hive
> Issue Type: Bug
> Reporter: Rajesh Balamohan
> Assignee: Dmitriy Fingerman
> Priority: Major
> Labels: performance, pull-request-available
> Time Spent: 50m
> Remaining Estimate: 0h
>
> First invocation in {{Hive::loadDynamicPartitions}}:
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L2978
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L2638
> Second invocation in {{BasicStatsTask::aggregateStats}}:
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/stats/BasicStatsTask.java#L335
> This leads to a significant delay in dynamic partition loading.
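The change under review gates the first `alter_partitions` call on whether stats auto-gathering is enabled: when it is, BasicStatsTask will alter the partitions anyway, so also calling during the load does a redundant metastore round trip. The toy model below shows only that gating logic; the class and method names are hypothetical, not Hive's code.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Toy model of avoiding the double alterPartitions call during dynamic
// partition load. Names are hypothetical; this is not Hive's code.
public class AlterPartitionsGateSketch {
    public final AtomicInteger alterCalls = new AtomicInteger();

    private void alterPartitions() { alterCalls.incrementAndGet(); }

    /** Runs during the dynamic partition load. */
    public void loadDynamicPartitions(boolean statsAutogather) {
        // If stats auto-gathering is on, BasicStatsTask will call
        // alterPartitions later anyway, so skip the call here.
        if (!statsAutogather) {
            alterPartitions();
        }
    }

    /** Runs afterwards, and alters partitions only when gathering stats. */
    public void basicStatsTask(boolean statsAutogather) {
        if (statsAutogather) {
            alterPartitions();
        }
    }

    public static void main(String[] args) {
        for (boolean autogather : new boolean[] {true, false}) {
            AlterPartitionsGateSketch s = new AlterPartitionsGateSketch();
            s.loadDynamicPartitions(autogather);
            s.basicStatsTask(autogather);
            // Either way, exactly one alterPartitions round trip.
            System.out.println(autogather + " -> " + s.alterCalls.get());
        }
    }
}
```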
[jira] [Assigned] (HIVE-27194) Support expression in limit and offset clauses
[ https://issues.apache.org/jira/browse/HIVE-27194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

vamshi kolanu reassigned HIVE-27194:
------------------------------------

> Support expression in limit and offset clauses
> ----------------------------------------------
>
> Key: HIVE-27194
> URL: https://issues.apache.org/jira/browse/HIVE-27194
> Project: Hive
> Issue Type: Task
> Components: Hive
> Reporter: vamshi kolanu
> Assignee: vamshi kolanu
> Priority: Major
>
> As part of this task, support expressions in both the limit and offset clauses. Currently these clauses only accept integer literals.
> For example, the following expressions will be supported after this change:
> 1. select key from (select * from src limit (1+2*3)) q1;
> 2. select key from (select * from src limit (1+2*3) offset (3*4*5)) q1;
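Supporting an expression in LIMIT/OFFSET amounts to constant-folding it to a non-negative integer at compile time, so that `limit (1+2*3)` behaves exactly like `limit 7`. The tiny recursive-descent evaluator below sketches that folding for `+`, `*`, and parentheses; it is a hypothetical illustration, far simpler than Hive's actual parser and constant-folding machinery.

```java
// Tiny constant-folder for +, *, and parentheses over non-negative
// integers, sketching how "limit (1+2*3)" can fold to "limit 7".
// Hypothetical; Hive's real parser/constant folding is far richer.
public class LimitExprSketch {
    private final String s;
    private int pos;

    private LimitExprSketch(String expr) { this.s = expr.replaceAll("\\s", ""); }

    public static long fold(String expr) {
        LimitExprSketch p = new LimitExprSketch(expr);
        long v = p.sum();
        if (p.pos != p.s.length()) throw new IllegalArgumentException("trailing input");
        return v;
    }

    // sum := product ('+' product)*
    private long sum() {
        long v = product();
        while (pos < s.length() && s.charAt(pos) == '+') {
            pos++;
            v += product();
        }
        return v;
    }

    // product := atom ('*' atom)*
    private long product() {
        long v = atom();
        while (pos < s.length() && s.charAt(pos) == '*') {
            pos++;
            v *= atom();
        }
        return v;
    }

    // atom := number | '(' sum ')'
    private long atom() {
        if (s.charAt(pos) == '(') {
            pos++; // consume '('
            long v = sum();
            pos++; // consume ')'
            return v;
        }
        int start = pos;
        while (pos < s.length() && Character.isDigit(s.charAt(pos))) pos++;
        return Long.parseLong(s.substring(start, pos));
    }

    public static void main(String[] args) {
        System.out.println(fold("(1+2*3)")); // 7
        System.out.println(fold("(3*4*5)")); // 60
    }
}
```

Folding the two example queries gives `limit 7` and `limit 7 offset 60`, matching ordinary integer-literal behavior.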
[jira] [Commented] (HIVE-22383) `alterPartitions` is invoked twice during dynamic partition load causing runtime delay
[ https://issues.apache.org/jira/browse/HIVE-22383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17706649#comment-17706649 ]

Dmitriy Fingerman commented on HIVE-22383:
------------------------------------------

Hi [~rajesh.balamohan], could you please review the PR?

> `alterPartitions` is invoked twice during dynamic partition load causing runtime delay
> --------------------------------------------------------------------------------------
>
> Key: HIVE-22383
> URL: https://issues.apache.org/jira/browse/HIVE-22383
> Project: Hive
> Issue Type: Bug
> Reporter: Rajesh Balamohan
> Assignee: Dmitriy Fingerman
> Priority: Major
> Labels: performance, pull-request-available
> Time Spent: 40m
> Remaining Estimate: 0h
>
> First invocation in {{Hive::loadDynamicPartitions}}:
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L2978
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L2638
> Second invocation in {{BasicStatsTask::aggregateStats}}:
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/stats/BasicStatsTask.java#L335
> This leads to a significant delay in dynamic partition loading.
[jira] [Work logged] (HIVE-27165) PART_COL_STATS metastore query not hitting the index
[ https://issues.apache.org/jira/browse/HIVE-27165?focusedWorklogId=853808&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853808 ]

ASF GitHub Bot logged work on HIVE-27165:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 30/Mar/23 00:25
Start Date: 30/Mar/23 00:25
Worklog Time Spent: 10m

Work Description: sonarcloud[bot] commented on PR #4141:
URL: https://github.com/apache/hive/pull/4141#issuecomment-1489516433

Kudos, SonarCloud Quality Gate passed! 0 Bugs, 0 Vulnerabilities, 0 Security Hotspots, 0 Code Smells. No Coverage information. No Duplication information.

Issue Time Tracking
-------------------
Worklog Id: (was: 853808)
Time Spent: 50m (was: 40m)

> PART_COL_STATS metastore query not hitting the index
> ----------------------------------------------------
>
> Key: HIVE-27165
> URL: https://issues.apache.org/jira/browse/HIVE-27165
> Project: Hive
> Issue Type: Improvement
> Reporter: Hongdan Zhu
> Priority: Major
> Labels: pull-request-available
> Time Spent: 50m
> Remaining Estimate: 0h
>
> The query located here:
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionTxnHandler.java#L1029-L1032]
> is not hitting an index. The index contains CAT_NAME, whereas this query does not. This was a change made in Hive 3.0, I think.
[jira] [Work logged] (HIVE-27135) AcidUtils#getHdfsDirSnapshots() throws FNFE when a directory is removed in HDFS
[ https://issues.apache.org/jira/browse/HIVE-27135?focusedWorklogId=853807&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853807 ]

ASF GitHub Bot logged work on HIVE-27135:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 29/Mar/23 23:47
Start Date: 29/Mar/23 23:47
Worklog Time Spent: 10m

Work Description: sonarcloud[bot] commented on PR #4114:
URL: https://github.com/apache/hive/pull/4114#issuecomment-1489482725

Kudos, SonarCloud Quality Gate passed! 0 Bugs, 0 Vulnerabilities, 0 Security Hotspots, 0 Code Smells. No Coverage information. No Duplication information.

Issue Time Tracking
-------------------
Worklog Id: (was: 853807)
Time Spent: 6h 40m (was: 6.5h)

> AcidUtils#getHdfsDirSnapshots() throws FNFE when a directory is removed in HDFS
> -------------------------------------------------------------------------------
>
> Key: HIVE-27135
> URL: https://issues.apache.org/jira/browse/HIVE-27135
> Project: Hive
> Issue Type: Bug
> Reporter: Dayakar M
> Assignee: Dayakar M
> Priority: Major
> Labels: pull-request-available
> Time Spent: 6h 40m
> Remaining Estimate: 0h
>
> AcidUtils#getHdfsDirSnapshots() throws FileNotFoundException when a directory is removed in HDFS while fetching HDFS snapshots.
> The test code below can be used to reproduce this issue.
> {code:java}
> @Test
> public void testShouldNotThrowFNFEWhenHiveStagingDirectoryIsRemovedWhileFetchingHDFSSnapshots()
>     throws Exception {
>   MockFileSystem fs = new MockFileSystem(new HiveConf(),
>       new MockFile("mock:/tbl/part1/.hive-staging_dir/-ext-10002", 500, new byte[0]),
>       new MockFile("mock:/tbl/part2/.hive-staging_dir", 500, new byte[0]),
>       new MockFile("mock:/tbl/part1/_tmp_space.db", 500, new byte[0]),
>       new MockFile("mock:/tbl/part1/delta_1_1/bucket--", 500, new byte[0]));
>   Path path = new MockPath(fs, "/tbl");
>   Path stageDir = new MockPath(fs, "mock:/tbl/part1/.hive-staging_dir");
>   FileSystem mockFs = spy(fs);
>   Mockito.doThrow(new
[jira] [Updated] (HIVE-27136) Backport HIVE-27129 to branch-3
[ https://issues.apache.org/jira/browse/HIVE-27136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junlin Zeng updated HIVE-27136:
-------------------------------
Description: Creating this ticket to track backporting HIVE-27129.

> Backport HIVE-27129 to branch-3
> -------------------------------
>
> Key: HIVE-27136
> URL: https://issues.apache.org/jira/browse/HIVE-27136
> Project: Hive
> Issue Type: Improvement
> Reporter: Junlin Zeng
> Assignee: Junlin Zeng
> Priority: Major
>
> Creating this ticket to track backporting HIVE-27129.
[jira] [Work started] (HIVE-27128) Exception "Can't finish byte read from uncompressed stream DATA position" when querying ORC table
[ https://issues.apache.org/jira/browse/HIVE-27128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on HIVE-27128 started by Dmitriy Fingerman.
------------------------------------------------

> Exception "Can't finish byte read from uncompressed stream DATA position" when querying ORC table
> -------------------------------------------------------------------------------------------------
>
> Key: HIVE-27128
> URL: https://issues.apache.org/jira/browse/HIVE-27128
> Project: Hive
> Issue Type: Bug
> Reporter: Dmitriy Fingerman
> Assignee: Dmitriy Fingerman
> Priority: Critical
>
> Exception happening when querying an ORC table:
> {code:java}
> Caused by: java.io.EOFException: Can't finish byte read from uncompressed stream DATA position: 393216 length: 393216 range: 23 offset: 376832 position: 16384 limit: 16384
> at org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays(TreeReaderFactory.java:1550)
> at org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.readOrcByteArrays(TreeReaderFactory.java:1566)
> at org.apache.orc.impl.TreeReaderFactory$StringDirectTreeReader.nextVector(TreeReaderFactory.java:1662)
> at org.apache.orc.impl.TreeReaderFactory$StringTreeReader.nextVector(TreeReaderFactory.java:1508)
> at org.apache.hadoop.hive.ql.io.orc.encoded.EncodedTreeReaderFactory$StringStreamReader.nextVector(EncodedTreeReaderFactory.java:305)
> at org.apache.hadoop.hive.llap.io.decode.OrcEncodedDataConsumer.decodeBatch(OrcEncodedDataConsumer.java:196)
> at org.apache.hadoop.hive.llap.io.decode.OrcEncodedDataConsumer.decodeBatch(OrcEncodedDataConsumer.java:66)
> at org.apache.hadoop.hive.llap.io.decode.EncodedDataConsumer.consumeData(EncodedDataConsumer.java:122)
> at org.apache.hadoop.hive.llap.io.decode.EncodedDataConsumer.consumeData(EncodedDataConsumer.java:42)
> at org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:608)
> at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:434)
> at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:282)
> at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:279)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878)
> at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:279)
> at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:118)
> at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
> at org.apache.hadoop.hive.llap.io.decode.EncodedDataConsumer$CpuRecordingCallable.call(EncodedDataConsumer.java:88)
> at org.apache.hadoop.hive.llap.io.decode.EncodedDataConsumer$CpuRecordingCallable.call(EncodedDataConsumer.java:73)
> {code}
> I created a q-test that reproduces this issue:
> [https://github.com/difin/hive/commits/orc_read_err_qtest]
> This issue happens in Hive starting from the commit that upgraded the ORC version in Hive to 1.6.7.
[jira] [Updated] (HIVE-26949) Backport HIVE-26071 to branch-3
[ https://issues.apache.org/jira/browse/HIVE-26949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HIVE-26949:
----------------------------------
Labels: pull-request-available (was: )

> Backport HIVE-26071 to branch-3
> -------------------------------
>
> Key: HIVE-26949
> URL: https://issues.apache.org/jira/browse/HIVE-26949
> Project: Hive
> Issue Type: Improvement
> Components: Metastore, Standalone Metastore
> Reporter: Vihang Karajgaonkar
> Assignee: Junlin Zeng
> Priority: Blocker
> Labels: pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Creating this ticket to backport HIVE-26071 to branch-3.
[jira] [Work logged] (HIVE-26949) Backport HIVE-26071 to branch-3
[ https://issues.apache.org/jira/browse/HIVE-26949?focusedWorklogId=853797&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853797 ]

ASF GitHub Bot logged work on HIVE-26949:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 29/Mar/23 21:00
Start Date: 29/Mar/23 21:00
Worklog Time Spent: 10m

Work Description: junlinzeng-db opened a new pull request, #4172:
URL: https://github.com/apache/hive/pull/4172

### What changes were proposed in this pull request?
### Why are the changes needed?
### Does this PR introduce _any_ user-facing change?
### How was this patch tested?

Issue Time Tracking
-------------------
Worklog Id: (was: 853797)
Remaining Estimate: 0h
Time Spent: 10m

> Backport HIVE-26071 to branch-3
> -------------------------------
>
> Key: HIVE-26949
> URL: https://issues.apache.org/jira/browse/HIVE-26949
> Project: Hive
> Issue Type: Improvement
> Components: Metastore, Standalone Metastore
> Reporter: Vihang Karajgaonkar
> Assignee: Junlin Zeng
> Priority: Blocker
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Creating this ticket to backport HIVE-26071 to branch-3.
[jira] [Commented] (HIVE-27128) Exception "Can't finish byte read from uncompressed stream DATA position" when querying ORC table
[ https://issues.apache.org/jira/browse/HIVE-27128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17706598#comment-17706598 ]

Dmitriy Fingerman commented on HIVE-27128:
------------------------------------------

This can be fixed once HIVE-26809 ('Upgrade ORC to 1.8.3') is resolved, because ORC 1.8.3 contains ORC-1393, which is required to fix this issue.

> Exception "Can't finish byte read from uncompressed stream DATA position" when querying ORC table
> -------------------------------------------------------------------------------------------------
>
> Key: HIVE-27128
> URL: https://issues.apache.org/jira/browse/HIVE-27128
> Project: Hive
> Issue Type: Bug
> Reporter: Dmitriy Fingerman
> Assignee: Dmitriy Fingerman
> Priority: Critical
>
> Exception happening when querying an ORC table:
> {code:java}
> Caused by: java.io.EOFException: Can't finish byte read from uncompressed stream DATA position: 393216 length: 393216 range: 23 offset: 376832 position: 16384 limit: 16384
> at org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays(TreeReaderFactory.java:1550)
> at org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.readOrcByteArrays(TreeReaderFactory.java:1566)
> at org.apache.orc.impl.TreeReaderFactory$StringDirectTreeReader.nextVector(TreeReaderFactory.java:1662)
> at org.apache.orc.impl.TreeReaderFactory$StringTreeReader.nextVector(TreeReaderFactory.java:1508)
> at org.apache.hadoop.hive.ql.io.orc.encoded.EncodedTreeReaderFactory$StringStreamReader.nextVector(EncodedTreeReaderFactory.java:305)
> at org.apache.hadoop.hive.llap.io.decode.OrcEncodedDataConsumer.decodeBatch(OrcEncodedDataConsumer.java:196)
> at org.apache.hadoop.hive.llap.io.decode.OrcEncodedDataConsumer.decodeBatch(OrcEncodedDataConsumer.java:66)
> at org.apache.hadoop.hive.llap.io.decode.EncodedDataConsumer.consumeData(EncodedDataConsumer.java:122)
> at org.apache.hadoop.hive.llap.io.decode.EncodedDataConsumer.consumeData(EncodedDataConsumer.java:42)
> at org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:608)
> at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:434)
> at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:282)
> at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:279)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878)
> at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:279)
> at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:118)
> at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
> at org.apache.hadoop.hive.llap.io.decode.EncodedDataConsumer$CpuRecordingCallable.call(EncodedDataConsumer.java:88)
> at org.apache.hadoop.hive.llap.io.decode.EncodedDataConsumer$CpuRecordingCallable.call(EncodedDataConsumer.java:73)
> {code}
> I created a q-test that reproduces this issue:
> [https://github.com/difin/hive/commits/orc_read_err_qtest]
> This issue happens in Hive starting from the commit that upgraded the ORC version in Hive to 1.6.7.
[jira] [Work logged] (HIVE-26997) Iceberg: Vectorization gets disabled at runtime in merge-into statements
[ https://issues.apache.org/jira/browse/HIVE-26997?focusedWorklogId=853789=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853789 ] ASF GitHub Bot logged work on HIVE-26997: - Author: ASF GitHub Bot Created on: 29/Mar/23 20:18 Start Date: 29/Mar/23 20:18 Worklog Time Spent: 10m Work Description: sonarcloud[bot] commented on PR #4162: URL: https://github.com/apache/hive/pull/4162#issuecomment-1489249100 Kudos, SonarCloud Quality Gate passed! 0 Bugs, 0 Vulnerabilities, 0 Security Hotspots, 0 Code Smells. No Coverage information. No Duplication information. Issue Time Tracking --- Worklog Id: (was: 853789) Time Spent: 1.5h (was: 1h 20m) > Iceberg: Vectorization gets disabled at runtime in merge-into statements > > > Key: HIVE-26997 > URL: https://issues.apache.org/jira/browse/HIVE-26997 > Project: Hive > Issue Type: Improvement > Components: Iceberg integration >Reporter: Rajesh Balamohan >Assignee: Zsolt Miskolczi >Priority: Major > Labels: pull-request-available > Attachments: explain_merge_into.txt > > Time Spent: 1.5h > Remaining Estimate: 0h > > *Query:* > Think of "ssv" table as a table containing trickle feed data in the following > query. "store_sales_delete_1" is the destination table. 
> > {noformat} > MERGE INTO tpcds_1000_iceberg_mor_v4.store_sales_delete_1 t USING > tpcds_1000_update.ssv s ON (t.ss_item_sk = s.ss_item_sk > > AND t.ss_customer_sk=s.ss_customer_sk > > AND t.ss_sold_date_sk = "2451181" > > AND ((Floor((s.ss_item_sk) / 1000) * 1000) BETWEEN 1000 AND > 2000) > > AND s.ss_ext_discount_amt < 0.0)
[jira] [Work logged] (HIVE-27165) PART_COL_STATS metastore query not hitting the index
[ https://issues.apache.org/jira/browse/HIVE-27165?focusedWorklogId=853788=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853788 ] ASF GitHub Bot logged work on HIVE-27165: - Author: ASF GitHub Bot Created on: 29/Mar/23 20:18 Start Date: 29/Mar/23 20:18 Worklog Time Spent: 10m Work Description: sonarcloud[bot] commented on PR #4141: URL: https://github.com/apache/hive/pull/4141#issuecomment-1489248563 Kudos, SonarCloud Quality Gate passed! 0 Bugs, 0 Vulnerabilities, 0 Security Hotspots, 0 Code Smells. No Coverage information. No Duplication information. Issue Time Tracking --- Worklog Id: (was: 853788) Time Spent: 40m (was: 0.5h) > PART_COL_STATS metastore query not hitting the index > > > Key: HIVE-27165 > URL: https://issues.apache.org/jira/browse/HIVE-27165 > Project: Hive > Issue Type: Improvement >Reporter: Hongdan Zhu >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > The query located here: > [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionTxnHandler.java#L1029-L1032] > is not hitting an index. The index contains CAT_NAME whereas this query does > not. This was a change made in Hive 3.0, I think. 
-- This message was sent by Atlassian Jira (v8.20.10#820010)
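The report above notes that the PART_COL_STATS query skips its index because the index's leading column, CAT_NAME, is not constrained in the WHERE clause. The effect is easy to demonstrate with a toy schema; the sketch below uses SQLite with simplified, hypothetical column and index names, not the real metastore DDL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-in for the metastore table; not the actual PART_COL_STATS DDL.
conn.execute("CREATE TABLE PART_COL_STATS "
             "(CAT_NAME TEXT, DB_NAME TEXT, TABLE_NAME TEXT, COLUMN_NAME TEXT, STATS BLOB)")
# Composite index whose leading column is CAT_NAME, mirroring the report.
conn.execute("CREATE INDEX PCS_STATS_IDX ON PART_COL_STATS "
             "(CAT_NAME, DB_NAME, TABLE_NAME, COLUMN_NAME)")

def plan(sql):
    # Concatenate the EXPLAIN QUERY PLAN detail rows for a statement.
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

# Omitting the leading CAT_NAME column leaves the optimizer with a full table scan.
no_cat = plan("SELECT * FROM PART_COL_STATS WHERE DB_NAME = 'db' AND TABLE_NAME = 't'")

# Constraining CAT_NAME as well allows the composite index to be used.
with_cat = plan("SELECT * FROM PART_COL_STATS "
                "WHERE CAT_NAME = 'hive' AND DB_NAME = 'db' AND TABLE_NAME = 't'")
```

The same principle applies to whichever RDBMS backs the metastore: adding the CAT_NAME predicate to the query (or reordering the index columns) is what allows the lookup to become index-backed.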
[jira] [Work logged] (HIVE-26800) Backport HIVE-21755 : Upgrading SQL server backed metastore when changing
[ https://issues.apache.org/jira/browse/HIVE-26800?focusedWorklogId=853773=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853773 ] ASF GitHub Bot logged work on HIVE-26800: - Author: ASF GitHub Bot Created on: 29/Mar/23 18:59 Start Date: 29/Mar/23 18:59 Worklog Time Spent: 10m Work Description: amanraj2520 commented on PR #4021: URL: https://github.com/apache/hive/pull/4021#issuecomment-1489140678 @abstractdog @zabetak @vihangk1 Can you please approve and merge this. Issue Time Tracking --- Worklog Id: (was: 853773) Time Spent: 0.5h (was: 20m) > Backport HIVE-21755 : Upgrading SQL server backed metastore when changing > - > > Key: HIVE-26800 > URL: https://issues.apache.org/jira/browse/HIVE-26800 > Project: Hive > Issue Type: Sub-task >Reporter: Aman Raj >Assignee: Aman Raj >Priority: Major > Labels: hive-3.2.0-must, pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-26900) Error message not representing the correct line number with a syntax error in a HQL File
[ https://issues.apache.org/jira/browse/HIVE-26900?focusedWorklogId=853760=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853760 ] ASF GitHub Bot logged work on HIVE-26900: - Author: ASF GitHub Bot Created on: 29/Mar/23 17:58 Start Date: 29/Mar/23 17:58 Worklog Time Spent: 10m Work Description: sonarcloud[bot] commented on PR #4171: URL: https://github.com/apache/hive/pull/4171#issuecomment-1489056490 Kudos, SonarCloud Quality Gate passed! 0 Bugs, 0 Vulnerabilities, 0 Security Hotspots, 2 Code Smells. No Coverage information. No Duplication information. Issue Time Tracking --- Worklog Id: (was: 853760) Time Spent: 1h 40m (was: 1.5h) > Error message not representing the correct line number with a syntax error in > a HQL File > > > Key: HIVE-26900 > URL: https://issues.apache.org/jira/browse/HIVE-26900 > Project: Hive > Issue Type: Bug >Affects Versions: 3.1.2, 4.0.0-alpha-1, 4.0.0-alpha-2 >Reporter: Vikram Ahuja >Priority: Minor > Labels: pull-request-available > Time Spent: 1h 40m > Remaining Estimate: 0h > > When a wrong syntax is added in a HQL file, the error thrown by beeline while > running the HQL file has the wrong line number. The line number and > even the position are incorrect. 
It seems the parser is not considering spaces > and new lines, and always throws the error on line number 1 irrespective of > which line the error is actually on in the HQL file. > > For instance, consider the following test.hql file: > # --comment > # --comment > # SET hive.server2.logging.operation.enabled=true; > # SET hive.server2.logging.operation.level=VERBOSE; > # show tables; > # > # > # CREATE TABLEE DUMMY; > > When we call !run test.hql in beeline or trigger ./beeline -u > jdbc:hive2://localhost:1 -f test.hql, the error thrown by beeline is > >>> CREATE TABLEE DUMMY; > Error: Error while compiling statement: FAILED: ParseException line 1:7 > cannot recognize input near 'CREATE' 'TABLEE' 'DUMMY' in ddl statement > (state=42000,code=4)
[jira] [Work logged] (HIVE-27135) AcidUtils#getHdfsDirSnapshots() throws FNFE when a directory is removed in HDFS
[ https://issues.apache.org/jira/browse/HIVE-27135?focusedWorklogId=853745=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853745 ] ASF GitHub Bot logged work on HIVE-27135: - Author: ASF GitHub Bot Created on: 29/Mar/23 17:03 Start Date: 29/Mar/23 17:03 Worklog Time Spent: 10m Work Description: mdayakar commented on code in PR #4114: URL: https://github.com/apache/hive/pull/4114#discussion_r1152246387 ## ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java: ## @@ -1538,32 +1538,36 @@ private static HdfsDirSnapshot addToSnapshot(Map<Path, HdfsDirSnapshot> dirToSna public static Map<Path, HdfsDirSnapshot> getHdfsDirSnapshots(final FileSystem fs, final Path path) throws IOException { Map<Path, HdfsDirSnapshot> dirToSnapshots = new HashMap<>(); -RemoteIterator<LocatedFileStatus> itr = FileUtils.listFiles(fs, path, true, acidHiddenFileFilter); -while (itr.hasNext()) { - FileStatus fStatus = itr.next(); - Path fPath = fStatus.getPath(); - if (fStatus.isDirectory() && acidTempDirFilter.accept(fPath)) { -addToSnapshot(dirToSnapshots, fPath); - } else { -Path parentDirPath = fPath.getParent(); -if (acidTempDirFilter.accept(parentDirPath)) { - while (isChildOfDelta(parentDirPath, path)) { -// Some cases there are other directory layers between the delta and the datafiles -// (export-import mm table, insert with union all to mm table, skewed tables). 
-// But it does not matter for the AcidState, we just need the deltas and the data files -// So build the snapshot with the files inside the delta directory -parentDirPath = parentDirPath.getParent(); - } - HdfsDirSnapshot dirSnapshot = addToSnapshot(dirToSnapshots, parentDirPath); - // We're not filtering out the metadata file and acid format file, - // as they represent parts of a valid snapshot - // We're not using the cached values downstream, but we can potentially optimize more in a follow-up task - if (fStatus.getPath().toString().contains(MetaDataFile.METADATA_FILE)) { -dirSnapshot.addMetadataFile(fStatus); - } else if (fStatus.getPath().toString().contains(OrcAcidVersion.ACID_FORMAT)) { -dirSnapshot.addOrcAcidFormatFile(fStatus); - } else { -dirSnapshot.addFile(fStatus); +Deque<RemoteIterator<LocatedFileStatus>> stack = new ArrayDeque<>(); +stack.push(FileUtils.listLocatedStatusIterator(fs, path, acidHiddenFileFilter)); +while (!stack.isEmpty()) { + RemoteIterator<LocatedFileStatus> itr = stack.pop(); + while (itr.hasNext()) { +FileStatus fStatus = itr.next(); +Path fPath = fStatus.getPath(); +if (fStatus.isDirectory()) { + stack.push(FileUtils.listLocatedStatusIterator(fs, fPath, acidHiddenFileFilter)); Review Comment: Here it will not list empty directories; actually, the above if condition is obsolete in the old code. Tested with both the old and the modified code: neither adds empty directories. Issue Time Tracking --- Worklog Id: (was: 853745) Time Spent: 6.5h (was: 6h 20m) > AcidUtils#getHdfsDirSnapshots() throws FNFE when a directory is removed in > HDFS > --- > > Key: HIVE-27135 > URL: https://issues.apache.org/jira/browse/HIVE-27135 > Project: Hive > Issue Type: Bug >Reporter: Dayakar M >Assignee: Dayakar M >Priority: Major > Labels: pull-request-available > Time Spent: 6.5h > Remaining Estimate: 0h > > AcidUtils#getHdfsDirSnapshots() throws FileNotFoundException when a directory > is removed in HDFS while fetching HDFS Snapshots. > Below testcode can be used to reproduce this issue. 
> {code:java} > @Test > public void > testShouldNotThrowFNFEWhenHiveStagingDirectoryIsRemovedWhileFetchingHDFSSnapshots() > throws Exception { > MockFileSystem fs = new MockFileSystem(new HiveConf(), > new MockFile("mock:/tbl/part1/.hive-staging_dir/-ext-10002", 500, new > byte[0]), > new MockFile("mock:/tbl/part2/.hive-staging_dir", 500, new byte[0]), > new MockFile("mock:/tbl/part1/_tmp_space.db", 500, new byte[0]), > new MockFile("mock:/tbl/part1/delta_1_1/bucket--", 500, new > byte[0])); > Path path = new MockPath(fs, "/tbl"); > Path stageDir = new MockPath(fs, "mock:/tbl/part1/.hive-staging_dir"); > FileSystem mockFs = spy(fs); > Mockito.doThrow(new > FileNotFoundException("")).when(mockFs).listLocatedStatus(eq(stageDir)); > try { > Map<Path, AcidUtils.HdfsDirSnapshot> hdfsDirSnapshots = > AcidUtils.getHdfsDirSnapshots(mockFs, path); > Assert.assertEquals(1, hdfsDirSnapshots.size()); > } > catch (FileNotFoundException fnf) { > fail("Should not throw FileNotFoundException when a directory is > removed while fetching HDFSSnapshots"); > } > }{code}
[jira] [Commented] (HIVE-27193) Database names starting with '@' cause error during ALTER/DROP table.
[ https://issues.apache.org/jira/browse/HIVE-27193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17706513#comment-17706513 ] Oliver Schiller commented on HIVE-27193: After poking a while around in the code and trying to understand how everything plays together, I'm still uncertain where the catalog name should be prepended. I don't know whether the following assumptions/invariants are correct: * If the request object does not have its own catalog name field, the catalog name is prepended and sent to the metastore via the dbname. This happens always in class (Session)HiveMetaStoreClient. * If a complete request object is passed to HiveMetaStoreClient, which is then just passed along, the caller ensures that the catalog is prepended. If they are correct, going through HiveMetaStoreClient following these assumptions should ensure that a leading '@' does not cause issues since a catalog is always prepended (however, if the catalog name is allowed to contain #, another problem would emerge). > Database names starting with '@' cause error during ALTER/DROP table. > - > > Key: HIVE-27193 > URL: https://issues.apache.org/jira/browse/HIVE-27193 > Project: Hive > Issue Type: Bug > Components: Metastore, Standalone Metastore >Affects Versions: 4.0.0-alpha-2 >Reporter: Oliver Schiller >Priority: Major > > The creation of database that start with '@' is supported: > > {code:java} > create database `@test`;{code} > > The creation of a table in this database works: > > {code:java} > create table `@test`.testtable (c1 integer);{code} > However, dropping or altering the table result in an error: > > {code:java} > drop table `@test`.testtable; > FAILED: SemanticException Unable to fetch table testtable. @test is prepended > with the catalog marker but does not appear to have a catalog name in it > Error: Error while compiling statement: FAILED: SemanticException Unable to > fetch table testtable. 
@test is prepended with the catalog marker but does > not appear to have a catalog name in it (state=42000,code=4) > alter table `@test`.testtable add columns (c2 integer); > FAILED: SemanticException Unable to fetch table testtable. @test is prepended > with the catalog marker but does not appear to have a catalog name in it > Error: Error while compiling statement: FAILED: SemanticException Unable to > fetch table testtable. @test is prepended with the catalog marker but does > not appear to have a catalog name in it (state=42000,code=4) > {code} > > Relevant snippet of stack trace: > > {code:java} > org.apache.hadoop.hive.metastore.api.MetaException: @TEST is prepended with > the catalog marker but does not appear to have a catalog name in it at > org.apache.hadoop.hive.metastore.utils.MetaStoreUtils.parseDbName(MetaStoreUtils.java:1031) at > org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.getTempTable(SessionHiveMetaStoreClient.java:651) > at > org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.getTable(SessionHiveMetaStoreClient.java:279) > at > org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.getTable(SessionHiveMetaStoreClient.java:273) > at > org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.getTable(SessionHiveMetaStoreClient.java:258) > at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropTable(HiveMetaStoreClient.java:1982) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropTable(HiveMetaStoreClient.java:1957) > ...{code} > > My suspicion is that this is caused by the implementation of getTempTable and > how it is called. The method getTempTable calls parseDbName assuming that the > given dbname might be prefixed with a catalog name. I'm wondering whether > this is correct at this layer. From poking a bit around, it appears to me > that the catalog name is typically prepended when making the actual thrift > call. > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
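The failure mode described in the stack trace, a raw database name beginning with '@' being mistaken for a catalog-qualified name, can be sketched as follows. This is an illustrative Python model inferred from the error message; the helper names and the '@'/'#' marker constants are assumptions mirroring the behaviour of MetaStoreUtils.parseDbName, not the actual implementation:

```python
CAT_MARKER = "@"  # assumed marker that a catalog name was prepended
CAT_SEP = "#"     # assumed separator between catalog and database name

def prepend_catalog(cat: str, db: str) -> str:
    # Qualify a database name with its catalog, e.g. ("hive", "test") -> "@hive#test".
    return f"{CAT_MARKER}{cat}{CAT_SEP}{db}"

def parse_db_name(name: str):
    # Split a possibly catalog-qualified name back into (catalog, db).
    if not name.startswith(CAT_MARKER):
        return (None, name)  # plain database name; default catalog resolved elsewhere
    body = name[len(CAT_MARKER):]
    if CAT_SEP not in body:
        # This is the branch a raw db name like "@test" falls into,
        # matching the MetaException quoted in the stack trace above.
        raise ValueError(name + " is prepended with the catalog marker but does"
                                " not appear to have a catalog name in it")
    cat, db = body.split(CAT_SEP, 1)
    return (cat, db)
```

The round trip works for properly qualified names, but a database literally named `@test`, passed to a layer that assumes qualification has already happened, trips the marker check without having a separator, which is consistent with the commenter's suspicion about where the prepending should occur.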
[jira] [Work logged] (HIVE-26900) Error message not representing the correct line number with a syntax error in a HQL File
[ https://issues.apache.org/jira/browse/HIVE-26900?focusedWorklogId=853744=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853744 ] ASF GitHub Bot logged work on HIVE-26900: - Author: ASF GitHub Bot Created on: 29/Mar/23 17:00 Start Date: 29/Mar/23 17:00 Worklog Time Spent: 10m Work Description: shreeyasand opened a new pull request, #4171: URL: https://github.com/apache/hive/pull/4171 …th a syntax error in a HQL File ### What changes were proposed in this pull request? In the Beeline class: - A new method executeReader() has been introduced specifically to read hql files. It makes one string out of all the contents of the hql file, separated by newline characters (the comments are excluded). In the Commands class: - Since handling multiple lines of query for hql files has already been addressed in the executeReader method, we limit the handleMultipleLineCmd() method to every other scenario besides reading an hql file. In both Beeline.java and Commands.java: - Trimming of the string/sql has been removed while reading hql file contents; trimming still happens whenever getOpts().getScriptFile() equals null (i.e., in every situation except when reading an hql file). This is done so that the whitespaces and empty lines are not ignored while counting the line numbers. ### Why are the changes needed? - Hive CLI reports the error line number correctly when reading HQL files, but Beeline does not. These changes are needed so that the error line number is reported correctly and there is no discrepancy between the functioning of Beeline and Hive CLI. ### Does this PR introduce _any_ user-facing change? - The error message in Beeline was not representing the correct line number prior to the changes. Now Beeline prints the correct error line number. ### How was this patch tested? - The testing was done locally on Beeline with multiple scenarios. The tests were verified against the correctly functioning Hive CLI. 
- As an example, for the given hql file: https://user-images.githubusercontent.com/50237152/222977016-e8a72f33-2f47-4ad4-aeff-2afb6f4a3bc9.png (screenshot) Error message prior to the changes: https://user-images.githubusercontent.com/50237152/222977044-90f746ee-1958-4c6a-9627-c1c1e2a173cc.png (screenshot) Error message after the changes: https://user-images.githubusercontent.com/50237152/222977064-d19b6bb8-b2bc-4292-a24a-1a14d04ab3eb.png (screenshot) Issue Time Tracking --- Worklog Id: (was: 853744) Time Spent: 1.5h (was: 1h 20m) > Error message not representing the correct line number with a syntax error in > a HQL File > > > Key: HIVE-26900 > URL: https://issues.apache.org/jira/browse/HIVE-26900 > Project: Hive > Issue Type: Bug >Affects Versions: 3.1.2, 4.0.0-alpha-1, 4.0.0-alpha-2 >Reporter: Vikram Ahuja >Priority: Minor > Labels: pull-request-available > Time Spent: 1.5h > Remaining Estimate: 0h > > When a wrong syntax is added in a HQL file, the error thrown by beeline while > running the HQL file has the wrong line number. The line number and > even the position are incorrect. It seems the parser is not considering spaces > and new lines, and always throws the error on line number 1 irrespective of > which line the error is actually on in the HQL file. > > For instance, consider the following test.hql file: > # --comment > # --comment > # SET hive.server2.logging.operation.enabled=true; > # SET hive.server2.logging.operation.level=VERBOSE; > # show tables; > # > # > # CREATE TABLEE DUMMY; > > When we call !run test.hql in beeline or trigger ./beeline -u > jdbc:hive2://localhost:1 -f test.hql, the error thrown by beeline is > >>> CREATE TABLEE DUMMY; > Error: Error while compiling statement: FAILED: ParseException line 1:7 > cannot recognize input near 'CREATE' 'TABLEE' 'DUMMY' in ddl statement > (state=42000,code=4) > The parser seems to be taking all the lines from 1 and is ignoring spaces in > the line. > The error line in the parse exception is shown as 1:7 but it should have been > 8:13. 
-- This message was sent by Atlassian Jira (v8.20.10#820010)
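One way to see why preserving whitespace fixes the reported line number: instead of dropping comment and blank lines before parsing, they can be replaced with empty lines, so positions the parser reports still map onto the original file. The sketch below is an illustrative Python model of that idea, not the actual Beeline patch:

```python
def strip_comments_preserving_lines(script: str) -> str:
    # Replace full-line "--" comments with empty lines instead of deleting them,
    # so a parser error reported at line N still refers to line N of the file.
    out = []
    for line in script.split("\n"):
        out.append("" if line.lstrip().startswith("--") else line)
    return "\n".join(out)

# Miniature version of the test.hql from the report: two comments,
# a statement, two blank lines, then the faulty statement on line 6.
hql = "--comment\n--comment\nshow tables;\n\n\nCREATE TABLEE DUMMY;"
cleaned = strip_comments_preserving_lines(hql)

# The faulty statement stays on the same (1-based) line in both versions.
line_of_error = cleaned.split("\n").index("CREATE TABLEE DUMMY;") + 1
```

Because the line count is unchanged, a ParseException raised against the cleaned text points at line 6, the same line the user sees in the original file, rather than collapsing everything onto line 1.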
[jira] [Work logged] (HIVE-27135) AcidUtils#getHdfsDirSnapshots() throws FNFE when a directory is removed in HDFS
[ https://issues.apache.org/jira/browse/HIVE-27135?focusedWorklogId=853743=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853743 ] ASF GitHub Bot logged work on HIVE-27135: - Author: ASF GitHub Bot Created on: 29/Mar/23 16:59 Start Date: 29/Mar/23 16:59 Worklog Time Spent: 10m Work Description: mdayakar commented on code in PR #4114: URL: https://github.com/apache/hive/pull/4114#discussion_r1152241772 ## common/src/java/org/apache/hadoop/hive/common/FileUtils.java: ## @@ -1376,6 +1376,12 @@ public static RemoteIterator<FileStatus> listStatusIterator(FileSystem fs, Path status -> filter.accept(status.getPath())); } + public static RemoteIterator<LocatedFileStatus> listLocatedStatusIterator(FileSystem fs, Path path, PathFilter filter) Review Comment: Changed the code to use the FileStatus object and to use the existing _org.apache.hadoop.hive.common.FileUtils#listStatusIterator()_ API, which is used in the _org.apache.hadoop.hive.ql.io.AcidUtils#getHdfsDirSnapshotsForCleaner()_ API. Issue Time Tracking --- Worklog Id: (was: 853743) Time Spent: 6h 20m (was: 6h 10m) > AcidUtils#getHdfsDirSnapshots() throws FNFE when a directory is removed in > HDFS > --- > > Key: HIVE-27135 > URL: https://issues.apache.org/jira/browse/HIVE-27135 > Project: Hive > Issue Type: Bug >Reporter: Dayakar M >Assignee: Dayakar M >Priority: Major > Labels: pull-request-available > Time Spent: 6h 20m > Remaining Estimate: 0h > > AcidUtils#getHdfsDirSnapshots() throws FileNotFoundException when a directory > is removed in HDFS while fetching HDFS Snapshots. > Below testcode can be used to reproduce this issue. 
> {code:java} > @Test > public void > testShouldNotThrowFNFEWhenHiveStagingDirectoryIsRemovedWhileFetchingHDFSSnapshots() > throws Exception { > MockFileSystem fs = new MockFileSystem(new HiveConf(), > new MockFile("mock:/tbl/part1/.hive-staging_dir/-ext-10002", 500, new > byte[0]), > new MockFile("mock:/tbl/part2/.hive-staging_dir", 500, new byte[0]), > new MockFile("mock:/tbl/part1/_tmp_space.db", 500, new byte[0]), > new MockFile("mock:/tbl/part1/delta_1_1/bucket--", 500, new > byte[0])); > Path path = new MockPath(fs, "/tbl"); > Path stageDir = new MockPath(fs, "mock:/tbl/part1/.hive-staging_dir"); > FileSystem mockFs = spy(fs); > Mockito.doThrow(new > FileNotFoundException("")).when(mockFs).listLocatedStatus(eq(stageDir)); > try { > Map<Path, AcidUtils.HdfsDirSnapshot> hdfsDirSnapshots = > AcidUtils.getHdfsDirSnapshots(mockFs, path); > Assert.assertEquals(1, hdfsDirSnapshots.size()); > } > catch (FileNotFoundException fnf) { > fail("Should not throw FileNotFoundException when a directory is > removed while fetching HDFSSnapshots"); > } > }{code} > This issue got fixed as a part of HIVE-26481, but it was not fixed > completely there. > [Here|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java#L1541] > the FileUtils.listFiles() API is used, which returns a RemoteIterator. > While iterating, if an entry is a directory and recursive listing is enabled, > it will try to list files from that directory; but if that directory has been removed by another > thread/task, it throws FileNotFoundException. Here the > directory which got removed is the .staging directory, which needs to be > excluded by using the passed filter. > > So here we can use the same logic written in the > _org.apache.hadoop.hive.ql.io.AcidUtils#getHdfsDirSnapshotsForCleaner()_ API > to avoid the FileNotFoundException. -- This message was sent by Atlassian Jira (v8.20.10#820010)
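The shape of the fix discussed in this thread, a non-recursive listing driven by an explicit stack that tolerates directories vanishing mid-walk, can be illustrated with a small Python sketch over a local filesystem. Here os.scandir stands in for FileSystem.listLocatedStatus, and the snapshot bookkeeping is reduced to collecting file paths:

```python
import os
from collections import deque

def walk_tolerating_removal(root):
    # Iterative walk with an explicit stack, mirroring the patched
    # getHdfsDirSnapshots(): a directory that vanishes between being pushed
    # and being listed (e.g. a .hive-staging dir cleaned up by another
    # thread/task) is skipped instead of aborting the whole walk.
    files = []
    stack = deque([root])
    while stack:
        d = stack.pop()
        try:
            entries = list(os.scandir(d))
        except FileNotFoundError:
            continue  # directory removed concurrently; just skip it
        for e in entries:
            if e.is_dir(follow_symlinks=False):
                stack.append(e.path)
            else:
                files.append(e.path)
    return files
```

The key point is that each directory's listing is its own call, so a FileNotFoundException (FileNotFoundError here) is scoped to that one directory; the original recursive listing surfaced the exception through the shared iterator and failed the entire snapshot.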
[jira] [Work logged] (HIVE-26900) Error message not representing the correct line number with a syntax error in a HQL File
[ https://issues.apache.org/jira/browse/HIVE-26900?focusedWorklogId=853741=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853741 ] ASF GitHub Bot logged work on HIVE-26900: - Author: ASF GitHub Bot Created on: 29/Mar/23 16:58 Start Date: 29/Mar/23 16:58 Worklog Time Spent: 10m Work Description: shreeyasand closed pull request #4168: HIVE-26900: Error message not representing the correct line number wi… URL: https://github.com/apache/hive/pull/4168 Issue Time Tracking --- Worklog Id: (was: 853741) Time Spent: 1h 20m (was: 1h 10m) > Error message not representing the correct line number with a syntax error in > a HQL File > > > Key: HIVE-26900 > URL: https://issues.apache.org/jira/browse/HIVE-26900 > Project: Hive > Issue Type: Bug >Affects Versions: 3.1.2, 4.0.0-alpha-1, 4.0.0-alpha-2 >Reporter: Vikram Ahuja >Priority: Minor > Labels: pull-request-available > Time Spent: 1h 20m > Remaining Estimate: 0h > > When a wrong syntax is added in a HQL file, the error thrown by beeline while > running the HQL file has the wrong line number. The line number and > even the position are incorrect. It seems the parser is not considering spaces > and new lines, and always throws the error on line number 1 irrespective of > which line the error is actually on in the HQL file. > > For instance, consider the following test.hql file: > # --comment > # --comment > # SET hive.server2.logging.operation.enabled=true; > # SET hive.server2.logging.operation.level=VERBOSE; > # show tables; > # > # > # CREATE TABLEE DUMMY; > > When we call !run test.hql in beeline or trigger ./beeline -u > jdbc:hive2://localhost:1 -f test.hql, the error thrown by beeline is > >>> CREATE TABLEE DUMMY; > Error: Error while compiling statement: FAILED: ParseException line 1:7 > cannot recognize input near 'CREATE' 'TABLEE' 'DUMMY' in ddl statement > (state=42000,code=4) > The parser seems to be taking all the lines from 1 and is ignoring spaces in > the line. 
> The error line in the parse exception is shown as 1:7 but it should have been > 8:13. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-27135) AcidUtils#getHdfsDirSnapshots() throws FNFE when a directory is removed in HDFS
[ https://issues.apache.org/jira/browse/HIVE-27135?focusedWorklogId=853739&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853739 ] ASF GitHub Bot logged work on HIVE-27135: - Author: ASF GitHub Bot Created on: 29/Mar/23 16:57 Start Date: 29/Mar/23 16:57 Worklog Time Spent: 10m Work Description: mdayakar commented on code in PR #4114: URL: https://github.com/apache/hive/pull/4114#discussion_r1152239566 ## ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java: ## @@ -1538,32 +1538,36 @@ private static HdfsDirSnapshot addToSnapshot(Map<Path, HdfsDirSnapshot> dirToSnapshots, public static Map<Path, HdfsDirSnapshot> getHdfsDirSnapshots(final FileSystem fs, final Path path) throws IOException { Map<Path, HdfsDirSnapshot> dirToSnapshots = new HashMap<>(); -RemoteIterator<FileStatus> itr = FileUtils.listFiles(fs, path, true, acidHiddenFileFilter); -while (itr.hasNext()) { - FileStatus fStatus = itr.next(); - Path fPath = fStatus.getPath(); - if (fStatus.isDirectory() && acidTempDirFilter.accept(fPath)) { -addToSnapshot(dirToSnapshots, fPath); - } else { -Path parentDirPath = fPath.getParent(); -if (acidTempDirFilter.accept(parentDirPath)) { - while (isChildOfDelta(parentDirPath, path)) { -// Some cases there are other directory layers between the delta and the datafiles -// (export-import mm table, insert with union all to mm table, skewed tables). 
-// But it does not matter for the AcidState, we just need the deltas and the data files -// So build the snapshot with the files inside the delta directory -parentDirPath = parentDirPath.getParent(); - } - HdfsDirSnapshot dirSnapshot = addToSnapshot(dirToSnapshots, parentDirPath); - // We're not filtering out the metadata file and acid format file, - // as they represent parts of a valid snapshot - // We're not using the cached values downstream, but we can potentially optimize more in a follow-up task - if (fStatus.getPath().toString().contains(MetaDataFile.METADATA_FILE)) { -dirSnapshot.addMetadataFile(fStatus); - } else if (fStatus.getPath().toString().contains(OrcAcidVersion.ACID_FORMAT)) { -dirSnapshot.addOrcAcidFormatFile(fStatus); - } else { -dirSnapshot.addFile(fStatus); +Deque<RemoteIterator<FileStatus>> stack = new ArrayDeque<>(); Review Comment: Updated accordingly. Issue Time Tracking --- Worklog Id: (was: 853739) Time Spent: 6h 10m (was: 6h) > AcidUtils#getHdfsDirSnapshots() throws FNFE when a directory is removed in > HDFS > --- > > Key: HIVE-27135 > URL: https://issues.apache.org/jira/browse/HIVE-27135 > Project: Hive > Issue Type: Bug >Reporter: Dayakar M >Assignee: Dayakar M >Priority: Major > Labels: pull-request-available > Time Spent: 6h 10m > Remaining Estimate: 0h > > AcidUtils#getHdfsDirSnapshots() throws FileNotFoundException when a directory > is removed in HDFS while fetching HDFS snapshots. > The test code below can be used to reproduce this issue. 
> {code:java} > @Test > public void > testShouldNotThrowFNFEWhenHiveStagingDirectoryIsRemovedWhileFetchingHDFSSnapshots() > throws Exception { > MockFileSystem fs = new MockFileSystem(new HiveConf(), > new MockFile("mock:/tbl/part1/.hive-staging_dir/-ext-10002", 500, new > byte[0]), > new MockFile("mock:/tbl/part2/.hive-staging_dir", 500, new byte[0]), > new MockFile("mock:/tbl/part1/_tmp_space.db", 500, new byte[0]), > new MockFile("mock:/tbl/part1/delta_1_1/bucket--", 500, new > byte[0])); > Path path = new MockPath(fs, "/tbl"); > Path stageDir = new MockPath(fs, "mock:/tbl/part1/.hive-staging_dir"); > FileSystem mockFs = spy(fs); > Mockito.doThrow(new > FileNotFoundException("")).when(mockFs).listLocatedStatus(eq(stageDir)); > try { > Map<Path, AcidUtils.HdfsDirSnapshot> hdfsDirSnapshots = > AcidUtils.getHdfsDirSnapshots(mockFs, path); > Assert.assertEquals(1, hdfsDirSnapshots.size()); > } > catch (FileNotFoundException fnf) { > fail("Should not throw FileNotFoundException when a directory is > removed while fetching HDFSSnapshots"); > } > }{code} > This issue got fixed as a part of HIVE-26481, but it is not fixed > completely here. > [Here|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java#L1541] > the FileUtils.listFiles() API is used, which returns a RemoteIterator. > While iterating over it, the code checks whether an entry is a directory and, for a recursive listing, > tries to list the files from that directory; but if that directory has been > removed by another thread/task in the meantime, it throws a FileNotFoundException.
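The direction taken in this PR, iterating with an explicit stack and tolerating directories that vanish between discovery and listing, can be sketched in a self-contained way. The sketch below uses plain java.nio rather than the actual Hadoop FileSystem/RemoteIterator API, and all names in it (`TolerantWalk`, `listFilesTolerantly`) are hypothetical, not the patch's code:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class TolerantWalk {
    // List one directory at a time from an explicit stack, and treat a
    // directory that disappears mid-walk (NoSuchFileException here, the
    // analogue of FileNotFoundException in HDFS) as empty instead of
    // failing the whole traversal.
    static List<Path> listFilesTolerantly(Path root) throws IOException {
        List<Path> files = new ArrayList<>();
        Deque<Path> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            Path dir = stack.pop();
            try (Stream<Path> entries = Files.list(dir)) {
                for (Path p : entries.collect(Collectors.toList())) {
                    if (Files.isDirectory(p)) {
                        stack.push(p); // descend later
                    } else {
                        files.add(p);
                    }
                }
            } catch (NoSuchFileException deletedConcurrently) {
                // Directory was removed by another task; skip it.
            }
        }
        return files;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempDirectory("walk");
        Files.createDirectories(tmp.resolve("delta_1_1"));
        Files.createFile(tmp.resolve("delta_1_1").resolve("bucket_0"));
        System.out.println(listFilesTolerantly(tmp).size()); // prints 1
    }
}
```

The key contrast with a single recursive RemoteIterator is that each directory listing is an independent call, so one missing directory cannot abort the snapshot of all the others.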
[jira] [Work logged] (HIVE-26400) Provide docker images for Hive
[ https://issues.apache.org/jira/browse/HIVE-26400?focusedWorklogId=853736=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853736 ] ASF GitHub Bot logged work on HIVE-26400: - Author: ASF GitHub Bot Created on: 29/Mar/23 16:41 Start Date: 29/Mar/23 16:41 Worklog Time Spent: 10m Work Description: TuroczyX commented on PR #3448: URL: https://github.com/apache/hive/pull/3448#issuecomment-1488941527 Seems like the build is broken. @deniskuzZ Could you please re-start? Issue Time Tracking --- Worklog Id: (was: 853736) Time Spent: 11h 40m (was: 11.5h) > Provide docker images for Hive > -- > > Key: HIVE-26400 > URL: https://issues.apache.org/jira/browse/HIVE-26400 > Project: Hive > Issue Type: Sub-task > Components: Build Infrastructure >Reporter: Zhihua Deng >Assignee: Zhihua Deng >Priority: Blocker > Labels: hive-4.0.0-must, pull-request-available > Time Spent: 11h 40m > Remaining Estimate: 0h > > Make Apache Hive be able to run inside docker container in pseudo-distributed > mode, with MySQL/Derby as its back database, provide the following: > * Quick-start/Debugging/Prepare a test env for Hive; > * Tools to build target image with specified version of Hive and its > dependencies; > * Images can be used as the basis for the Kubernetes operator. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-26400) Provide docker images for Hive
[ https://issues.apache.org/jira/browse/HIVE-26400?focusedWorklogId=853735=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853735 ] ASF GitHub Bot logged work on HIVE-26400: - Author: ASF GitHub Bot Created on: 29/Mar/23 16:37 Start Date: 29/Mar/23 16:37 Worklog Time Spent: 10m Work Description: TuroczyX commented on PR #3448: URL: https://github.com/apache/hive/pull/3448#issuecomment-1488936926 > > > Should be included in this initiative also create an docker image for the hive metastore standalone? > > > Something like this: https://techjogging.com/standalone-hive-metastore-presto-docker.html > > > https://github.com/arempter/hive-metastore-docker/blob/master/Dockerfile > > > https://github.com/aws-samples/hive-emr-on-eks/blob/main/docker/Dockerfile > > > Thanks > > > > > > It is a good point. @dengzhhu653 @deniskuzZ @ayushtkn @abstractdog What do you think about it? > > The image can serve both HS2 and Metastore, as you can see in the README: https://github.com/apache/hive/pull/3448/files#diff-75345b4702a737ff955983bea3daeac9243e26ef1d2dc0398a31ef28380da9cb. Separating them needs another build, makes it a bit hard to maintain in the public repo. Understandable. Issue Time Tracking --- Worklog Id: (was: 853735) Time Spent: 11.5h (was: 11h 20m) > Provide docker images for Hive > -- > > Key: HIVE-26400 > URL: https://issues.apache.org/jira/browse/HIVE-26400 > Project: Hive > Issue Type: Sub-task > Components: Build Infrastructure >Reporter: Zhihua Deng >Assignee: Zhihua Deng >Priority: Blocker > Labels: hive-4.0.0-must, pull-request-available > Time Spent: 11.5h > Remaining Estimate: 0h > > Make Apache Hive be able to run inside docker container in pseudo-distributed > mode, with MySQL/Derby as its back database, provide the following: > * Quick-start/Debugging/Prepare a test env for Hive; > * Tools to build target image with specified version of Hive and its > dependencies; > * Images can be used as the basis for the Kubernetes operator. 
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (HIVE-27193) Database names starting with '@' cause error during ALTER/DROP table.
[ https://issues.apache.org/jira/browse/HIVE-27193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17706501#comment-17706501 ] Oliver Schiller commented on HIVE-27193: The code in getTempTable was added in [https://github.com/apache/hive/pull/3072]. It seems that it was added to deal with changes made in getPartitionsByNames. > Database names starting with '@' cause error during ALTER/DROP table. > - > > Key: HIVE-27193 > URL: https://issues.apache.org/jira/browse/HIVE-27193 > Project: Hive > Issue Type: Bug > Components: Metastore, Standalone Metastore >Affects Versions: 4.0.0-alpha-2 >Reporter: Oliver Schiller >Priority: Major > > The creation of a database that starts with '@' is supported: > > {code:java} > create database `@test`;{code} > > The creation of a table in this database works: > > {code:java} > create table `@test`.testtable (c1 integer);{code} > However, dropping or altering the table results in an error: > > {code:java} > drop table `@test`.testtable; > FAILED: SemanticException Unable to fetch table testtable. @test is prepended > with the catalog marker but does not appear to have a catalog name in it > Error: Error while compiling statement: FAILED: SemanticException Unable to > fetch table testtable. @test is prepended with the catalog marker but does > not appear to have a catalog name in it (state=42000,code=4) > alter table `@test`.testtable add columns (c2 integer); > FAILED: SemanticException Unable to fetch table testtable. @test is prepended > with the catalog marker but does not appear to have a catalog name in it > Error: Error while compiling statement: FAILED: SemanticException Unable to > fetch table testtable. 
@test is prepended with the catalog marker but does > not appear to have a catalog name in it (state=42000,code=4) > {code} > > Relevant snippet of stack trace: > > {code:java} > org.apache.hadoop.hive.metastore.api.MetaException: @TEST is prepended with > the catalog marker but does not appear to have a catalog name in it at > org.apache.hadoop.hive.metastore.utils.MetaStoreUtils.parseDbName(MetaStoreUtils.java:1031) > at > org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.getTempTable(SessionHiveMetaStoreClient.java:651) > at > org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.getTable(SessionHiveMetaStoreClient.java:279) > at > org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.getTable(SessionHiveMetaStoreClient.java:273) > at > org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.getTable(SessionHiveMetaStoreClient.java:258) > at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropTable(HiveMetaStoreClient.java:1982)org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropTable(HiveMetaStoreClient.java:1957) > ...{code} > > My suspicion is that this is caused by the implementation of getTempTable and > how it is called. The method getTempTable calls parseDbName assuming that the > given dbname might be prefixed with a catalog name. I'm wondering whether > this is correct at this layer. From poking a bit around, it appears to me > that the catalog name is typically prepended when making the actual thrift > call. > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
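For context, the metastore encodes a catalog-qualified database name as a single string before the thrift call, and the error message suggests the convention `@<catalog>#<database>`. The sketch below imitates that convention to show why a database literally named `@test` trips the check; the marker and separator characters are inferred from the error message, and `CatalogMarker`/`parseDbName` here are a simplified stand-in, not the real MetaStoreUtils.parseDbName:

```java
public class CatalogMarker {
    // Assumed encoding: "@<catalog>#<database>". A name that starts with
    // '@' but has no '#' separator is rejected, which is exactly what
    // happens to a legal database named "@test".
    static String[] parseDbName(String name) {
        if (name != null && name.startsWith("@")) {
            int sep = name.indexOf('#');
            if (sep < 0) {
                throw new IllegalArgumentException(name
                    + " is prepended with the catalog marker but does not"
                    + " appear to have a catalog name in it");
            }
            return new String[] {name.substring(1, sep), name.substring(sep + 1)};
        }
        // No marker: no catalog component, plain database name.
        return new String[] {null, name};
    }

    public static void main(String[] args) {
        // A genuinely catalog-qualified name parses fine...
        String[] ok = parseDbName("@hive#test");
        System.out.println(ok[0] + "/" + ok[1]); // prints hive/test
        // ...but the legal database name "@test" trips the same check.
        try {
            parseDbName("@test");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: @test");
        }
    }
}
```

This matches the commenter's suspicion: the check is harmless when applied to strings that were deliberately catalog-prefixed, but wrong when applied to a raw user-supplied database name.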
[jira] [Work logged] (HIVE-26809) Upgrade ORC to 1.8.3
[ https://issues.apache.org/jira/browse/HIVE-26809?focusedWorklogId=853704&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853704 ] ASF GitHub Bot logged work on HIVE-26809: - Author: ASF GitHub Bot Created on: 29/Mar/23 14:20 Start Date: 29/Mar/23 14:20 Worklog Time Spent: 10m Work Description: sonarcloud[bot] commented on PR #4121: URL: https://github.com/apache/hive/pull/4121#issuecomment-1488718775 Kudos, SonarCloud Quality Gate passed! 0 Bugs, 0 Vulnerabilities, 0 Security Hotspots, 2 Code Smells; no coverage or duplication information. (Dashboard: https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=4121) Issue Time Tracking --- Worklog Id: (was: 853704) Time Spent: 8h 40m (was: 8.5h) > Upgrade ORC to 1.8.3 > > > Key: HIVE-26809 > URL: https://issues.apache.org/jira/browse/HIVE-26809 > Project: Hive > Issue Type: Improvement >Affects Versions: 4.0.0 >Reporter: Dmitriy Fingerman >Assignee: Zoltán Rátkai >Priority: Major > Labels: pull-request-available > Time Spent: 8h 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-27192) Use normal import instead of shaded import in TestSchemaToolCatalogOps.java
[ https://issues.apache.org/jira/browse/HIVE-27192?focusedWorklogId=853701&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853701 ] ASF GitHub Bot logged work on HIVE-27192: - Author: ASF GitHub Bot Created on: 29/Mar/23 14:13 Start Date: 29/Mar/23 14:13 Worklog Time Spent: 10m Work Description: sonarcloud[bot] commented on PR #4169: URL: https://github.com/apache/hive/pull/4169#issuecomment-1488706465 Kudos, SonarCloud Quality Gate passed! 0 Bugs, 0 Vulnerabilities, 0 Security Hotspots, 0 Code Smells; no coverage or duplication information. (Dashboard: https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=4169) Issue Time Tracking --- Worklog Id: (was: 853701) Time Spent: 20m (was: 10m) > Use normal import instead of shaded import in TestSchemaToolCatalogOps.java > --- > > Key: HIVE-27192 > URL: https://issues.apache.org/jira/browse/HIVE-27192 > Project: Hive > Issue Type: Improvement >Reporter: Zoltán Rátkai >Assignee: Zoltán Rátkai >Priority: Minor > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HIVE-27192) Use normal import instead of shaded import in TestSchemaToolCatalogOps.java
[ https://issues.apache.org/jira/browse/HIVE-27192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-27192: -- Labels: pull-request-available (was: ) > Use normal import instead of shaded import in TestSchemaToolCatalogOps.java > --- > > Key: HIVE-27192 > URL: https://issues.apache.org/jira/browse/HIVE-27192 > Project: Hive > Issue Type: Improvement >Reporter: Zoltán Rátkai >Assignee: Zoltán Rátkai >Priority: Minor > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-27192) Use normal import instead of shaded import in TestSchemaToolCatalogOps.java
[ https://issues.apache.org/jira/browse/HIVE-27192?focusedWorklogId=853680=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853680 ] ASF GitHub Bot logged work on HIVE-27192: - Author: ASF GitHub Bot Created on: 29/Mar/23 13:10 Start Date: 29/Mar/23 13:10 Worklog Time Spent: 10m Work Description: zratkai opened a new pull request, #4169: URL: https://github.com/apache/hive/pull/4169 …olCatalogOps.java ### What changes were proposed in this pull request? Eliminated the shaded usage of com.google.common.io.Files; ### Why are the changes needed? We do not need shaded usage here. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Jenkins test. Issue Time Tracking --- Worklog Id: (was: 853680) Remaining Estimate: 0h Time Spent: 10m > Use normal import instead of shaded import in TestSchemaToolCatalogOps.java > --- > > Key: HIVE-27192 > URL: https://issues.apache.org/jira/browse/HIVE-27192 > Project: Hive > Issue Type: Improvement >Reporter: Zoltán Rátkai >Assignee: Zoltán Rátkai >Priority: Minor > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-26900) Error message not representing the correct line number with a syntax error in a HQL File
[ https://issues.apache.org/jira/browse/HIVE-26900?focusedWorklogId=853678=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853678 ] ASF GitHub Bot logged work on HIVE-26900: - Author: ASF GitHub Bot Created on: 29/Mar/23 13:01 Start Date: 29/Mar/23 13:01 Worklog Time Spent: 10m Work Description: shreeyasand opened a new pull request, #4168: URL: https://github.com/apache/hive/pull/4168 …th a syntax error in a HQL File ### What changes were proposed in this pull request? In the Beeline class: - A new method executeReader() has been introduced specifically to read hql files. It makes one string out of all the contents of the hql file separated by newline characters (the comments are excluded). In the Commands class: - Since handling multiple lines of query for hql files has already been addressed in the executeReader method, we limit the handleMultipleLineCmd() method to every other scenario besides when reading an hql file. In both Beeline.java and Commands.java: Trimming of the string/sql has been removed while reading hql file contents. This is achieved whenever getOpts().getScriptFile() equals null (ie this is for every situation except when reading an hql file). This is done so that the whitespaces and empty lines are not ignored while counting the line numbers. ### Why are the changes needed? Hive Cli throws error line number correctly when reading HQL files, but Beeline does not. These changes are needed so that the error line number is thrown correctly and there is no discrepancy between the functioning of Beeline and Hive Cli. ### Does this PR introduce _any_ user-facing change? Error message in Beeline was not representing the correct line number prior to the changes. Now Beeline prints the correct error line number. ### How was this patch tested? The testing was done locally on Beeline with multiple scenarios. The test were verified against the correctly functioning Hive Cli. 
As an example, for the given hql file: https://user-images.githubusercontent.com/50237152/222977016-e8a72f33-2f47-4ad4-aeff-2afb6f4a3bc9.png Error message prior to the changes: https://user-images.githubusercontent.com/50237152/222977044-90f746ee-1958-4c6a-9627-c1c1e2a173cc.png Error message after the changes: https://user-images.githubusercontent.com/50237152/222977064-d19b6bb8-b2bc-4292-a24a-1a14d04ab3eb.png Issue Time Tracking --- Worklog Id: (was: 853678) Time Spent: 1h 10m (was: 1h) > Error message not representing the correct line number with a syntax error in > a HQL File > > > Key: HIVE-26900 > URL: https://issues.apache.org/jira/browse/HIVE-26900 > Project: Hive > Issue Type: Bug >Affects Versions: 3.1.2, 4.0.0-alpha-1, 4.0.0-alpha-2 >Reporter: Vikram Ahuja >Priority: Minor > Labels: pull-request-available > Time Spent: 1h 10m > Remaining Estimate: 0h > > When a wrong syntax is added in a HQL file, the error thrown by beeline while > running the HQL file is having the wrong line number. The line number and > even the position is incorrect. Seems like parser is not considering spaces > and new lines and always throwing the error on line number 1 irrespective of > what line the error is on in the HQL file > > For instance, consider the following test.hql file: > # --comment > # --comment > # SET hive.server2.logging.operation.enabled=true; > # SET hive.server2.logging.operation.level=VERBOSE; > # show tables; > # > # > # CREATE TABLEE DUMMY; > > when we call !run test.hql in beeline or trigger ./beeline -u > jdbc:hive2://localhost:1 -f test.hql, The issue thrown by beeline is > >>> CREATE TABLEE DUMMY; > Error: Error while compiling statement: FAILED: ParseException line 1:7 > cannot recongize input near 'CREATE' 'TABLEE' 'DUMMY' in ddl statement > (state=42000,code=4) > The parser seems to be taking all the lines from 1 and is ignoring spaces in > the line. 
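The core idea of this PR, build one newline-joined string from the script so that the parser's reported line and column match the file, can be illustrated with a small self-contained sketch. `joinPreservingLines` below is a hypothetical helper, not Beeline's actual executeReader():

```java
import java.util.List;

public class ScriptLineNumbers {
    // Join every line of the script with '\n'. Full-line comments are
    // dropped, but their newlines are kept, so later statements stay on
    // their original line numbers and no trimming shifts column offsets.
    static String joinPreservingLines(List<String> fileLines) {
        StringBuilder sb = new StringBuilder();
        for (String line : fileLines) {
            if (!line.trim().startsWith("--")) {
                sb.append(line);
            }
            sb.append('\n'); // keep the line break even for skipped comments
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> script = List.of(
            "--comment",
            "show tables;",
            "",
            "CREATE TABLEE DUMMY;");
        String joined = joinPreservingLines(script);
        // Count newlines before the bad statement: it still starts on
        // line 4 of the joined string, matching the file.
        int line = 1;
        int idx = joined.indexOf("CREATE TABLEE");
        for (int i = 0; i < idx; i++) {
            if (joined.charAt(i) == '\n') line++;
        }
        System.out.println(line); // prints 4
    }
}
```

By contrast, executing each statement as its own freshly trimmed string resets the parser's counter, which is why every ParseException previously pointed at line 1.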
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (HIVE-27192) Use normal import instead of shaded import in TestSchemaToolCatalogOps.java
[ https://issues.apache.org/jira/browse/HIVE-27192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltán Rátkai reassigned HIVE-27192: > Use normal import instead of shaded import in TestSchemaToolCatalogOps.java > --- > > Key: HIVE-27192 > URL: https://issues.apache.org/jira/browse/HIVE-27192 > Project: Hive > Issue Type: Improvement >Reporter: Zoltán Rátkai >Assignee: Zoltán Rátkai >Priority: Minor > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-26900) Error message not representing the correct line number with a syntax error in a HQL File
[ https://issues.apache.org/jira/browse/HIVE-26900?focusedWorklogId=853676=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853676 ] ASF GitHub Bot logged work on HIVE-26900: - Author: ASF GitHub Bot Created on: 29/Mar/23 12:40 Start Date: 29/Mar/23 12:40 Worklog Time Spent: 10m Work Description: shreeyasand closed pull request #4097: HIVE-26900: Error message not representing the correct line number wi… URL: https://github.com/apache/hive/pull/4097 Issue Time Tracking --- Worklog Id: (was: 853676) Time Spent: 1h (was: 50m) > Error message not representing the correct line number with a syntax error in > a HQL File > > > Key: HIVE-26900 > URL: https://issues.apache.org/jira/browse/HIVE-26900 > Project: Hive > Issue Type: Bug >Affects Versions: 3.1.2, 4.0.0-alpha-1, 4.0.0-alpha-2 >Reporter: Vikram Ahuja >Priority: Minor > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > When a wrong syntax is added in a HQL file, the error thrown by beeline while > running the HQL file is having the wrong line number. The line number and > even the position is incorrect. Seems like parser is not considering spaces > and new lines and always throwing the error on line number 1 irrespective of > what line the error is on in the HQL file > > For instance, consider the following test.hql file: > # --comment > # --comment > # SET hive.server2.logging.operation.enabled=true; > # SET hive.server2.logging.operation.level=VERBOSE; > # show tables; > # > # > # CREATE TABLEE DUMMY; > > when we call !run test.hql in beeline or trigger ./beeline -u > jdbc:hive2://localhost:1 -f test.hql, The issue thrown by beeline is > >>> CREATE TABLEE DUMMY; > Error: Error while compiling statement: FAILED: ParseException line 1:7 > cannot recongize input near 'CREATE' 'TABLEE' 'DUMMY' in ddl statement > (state=42000,code=4) > The parser seems to be taking all the lines from 1 and is ignoring spaces in > the line. 
> The error line in the parse exception is shown as 1:7 but it should have been > 8:13. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-26900) Error message not representing the correct line number with a syntax error in a HQL File
[ https://issues.apache.org/jira/browse/HIVE-26900?focusedWorklogId=853672=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853672 ] ASF GitHub Bot logged work on HIVE-26900: - Author: ASF GitHub Bot Created on: 29/Mar/23 12:38 Start Date: 29/Mar/23 12:38 Worklog Time Spent: 10m Work Description: shreeyasand opened a new pull request, #4097: URL: https://github.com/apache/hive/pull/4097 …th a syntax error in a HQL File ### What changes were proposed in this pull request? In the Beeline class: - the execute method (at line 1362) has been modified to make one string out of all the contents of the hql file separated by newline characters (the comments are excluded). - if the final string is null, the code exits the while loop (it implies that there is no command to be executed). In both the classes (Beeline and Commands), the trim() method has been removed from a few places. This is done so that the whitespaces and empty lines are not ignored while counting the line numbers. ### Why are the changes needed? Hive Cli throws error line number correctly when reading HQL files, but Beeline does not. These changes are needed so that the error line number is thrown correctly and there is no discrepancy between the functioning of Beeline and Hive Cli. ### Does this PR introduce _any_ user-facing change? Error message in Beeline was not representing the correct line number prior to the changes. Now Beeline prints the correct error line number. ### How was this patch tested? The testing was done locally on Beeline with multiple scenarios. The test were verified against the correctly functioning Hive Cli. 
As an example, for the given hql file: https://user-images.githubusercontent.com/50237152/222977016-e8a72f33-2f47-4ad4-aeff-2afb6f4a3bc9.png Error message prior to the changes: https://user-images.githubusercontent.com/50237152/222977044-90f746ee-1958-4c6a-9627-c1c1e2a173cc.png Error message after the changes: https://user-images.githubusercontent.com/50237152/222977064-d19b6bb8-b2bc-4292-a24a-1a14d04ab3eb.png Issue Time Tracking --- Worklog Id: (was: 853672) Time Spent: 50m (was: 40m) > Error message not representing the correct line number with a syntax error in > a HQL File > > > Key: HIVE-26900 > URL: https://issues.apache.org/jira/browse/HIVE-26900 > Project: Hive > Issue Type: Bug >Affects Versions: 3.1.2, 4.0.0-alpha-1, 4.0.0-alpha-2 >Reporter: Vikram Ahuja >Priority: Minor > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > When a wrong syntax is added in a HQL file, the error thrown by beeline while > running the HQL file is having the wrong line number. The line number and > even the position is incorrect. Seems like parser is not considering spaces > and new lines and always throwing the error on line number 1 irrespective of > what line the error is on in the HQL file > > For instance, consider the following test.hql file: > # --comment > # --comment > # SET hive.server2.logging.operation.enabled=true; > # SET hive.server2.logging.operation.level=VERBOSE; > # show tables; > # > # > # CREATE TABLEE DUMMY; > > when we call !run test.hql in beeline or trigger ./beeline -u > jdbc:hive2://localhost:1 -f test.hql, The issue thrown by beeline is > >>> CREATE TABLEE DUMMY; > Error: Error while compiling statement: FAILED: ParseException line 1:7 > cannot recongize input near 'CREATE' 'TABLEE' 'DUMMY' in ddl statement > (state=42000,code=4) > The parser seems to be taking all the lines from 1 and is ignoring spaces in > the line. > The error line in the parse exception is shown as 1:7 but it should have been > 8:13. 
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-27187) Incremental rebuild of materialized view having aggregate and stored by iceberg
[ https://issues.apache.org/jira/browse/HIVE-27187?focusedWorklogId=853666&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853666 ] ASF GitHub Bot logged work on HIVE-27187: - Author: ASF GitHub Bot Created on: 29/Mar/23 12:01 Start Date: 29/Mar/23 12:01 Worklog Time Spent: 10m Work Description: sonarcloud[bot] commented on PR #4166: URL: https://github.com/apache/hive/pull/4166#issuecomment-1488471135 Kudos, SonarCloud Quality Gate passed! 0 Bugs, 0 Vulnerabilities, 0 Security Hotspots, 4 Code Smells; no coverage or duplication information. (Dashboard: https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=4166) Issue Time Tracking --- Worklog Id: (was: 853666) Time Spent: 0.5h (was: 20m) > Incremental rebuild of materialized view having aggregate and stored by > iceberg > --- > > Key: HIVE-27187 > URL: https://issues.apache.org/jira/browse/HIVE-27187 > Project: Hive > Issue Type: Improvement > Components: Iceberg integration, Materialized views >Reporter: Krisztian Kasa >Assignee: Krisztian Kasa >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > Currently, incremental rebuild of a materialized view stored by iceberg whose > definition query contains an aggregate operator is transformed into an insert > overwrite statement which contains a union operator if the source tables > contain insert operations 
only. One branch of the union scans the view, the > other produces the delta. > This can be improved further: transform the statement into a multi-insert > statement representing a merge statement, to insert new aggregations and > update existing ones. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-27189) Remove duplicate info log in Hive.isSubDIr
[ https://issues.apache.org/jira/browse/HIVE-27189?focusedWorklogId=853654&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853654 ] ASF GitHub Bot logged work on HIVE-27189: - Author: ASF GitHub Bot Created on: 29/Mar/23 10:56 Start Date: 29/Mar/23 10:56 Worklog Time Spent: 10m Work Description: sonarcloud[bot] commented on PR #4167: URL: https://github.com/apache/hive/pull/4167#issuecomment-1488378213 Kudos, SonarCloud Quality Gate passed! 0 Bugs, 0 Vulnerabilities, 0 Security Hotspots, 1 Code Smell; no coverage or duplication information. Issue Time Tracking --- Worklog Id: (was: 853654) Time Spent: 20m (was: 10m) > Remove duplicate info log in Hive.isSubDIr > -- > > Key: HIVE-27189 > URL: https://issues.apache.org/jira/browse/HIVE-27189 > Project: Hive > Issue Type: Improvement >Reporter: shuyouZZ >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > In class {{org.apache.hadoop.hive.ql.metadata.Hive}}, invoking the method > {{isSubDir}} prints the following line twice: > {code:java} > LOG.debug("The source path is " + fullF1 + " and the destination path is " + > fullF2);{code} > We should remove the duplicate log call. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HIVE-27189) Remove duplicate info log in Hive.isSubDIr
[ https://issues.apache.org/jira/browse/HIVE-27189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-27189: -- Labels: pull-request-available (was: ) > Remove duplicate info log in Hive.isSubDIr > -- > > Key: HIVE-27189 > URL: https://issues.apache.org/jira/browse/HIVE-27189 > Project: Hive > Issue Type: Improvement >Reporter: shuyouZZ >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > In class {{org.apache.hadoop.hive.ql.metadata.Hive}}, invoking the method > {{isSubDir}} prints the following line twice: > {code:java} > LOG.debug("The source path is " + fullF1 + " and the destination path is " + > fullF2);{code} > We should remove the duplicate log call. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (HIVE-27191) Cleaner is blocked by orphaned entries in MHL table
[ https://issues.apache.org/jira/browse/HIVE-27191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simhadri Govindappa reassigned HIVE-27191: -- > Cleaner is blocked by orphaned entries in MHL table > --- > > Key: HIVE-27191 > URL: https://issues.apache.org/jira/browse/HIVE-27191 > Project: Hive > Issue Type: Improvement >Reporter: Simhadri Govindappa >Assignee: Simhadri Govindappa >Priority: Major > > The following mhl_txnids do not exist in the TXNS table; as a result, the > cleaner gets blocked and many entries are stuck in the ready-for-cleaning > state. > The cleaner should periodically check for such entries and remove them from > the MIN_HISTORY_LEVEL (MHL) table to prevent the cleaner from being blocked. > {noformat} > postgres=# select mhl_txnid from min_history_level where not exists (select 1 > from txns where txn_id = mhl_txnid); > mhl_txnid > --- > 43708080 > 43708088 > 43679962 > 43680464 > 43680352 > 43680392 > 43680424 > 43680436 > 43680471 > 43680475 > 43680483 > 43622677 > 43708083 > 43708084 > 43678157 > 43680482 > 43680484 > 43622745 > 43622750 > 43706829 > 43707261 > (21 rows){noformat} -- This message was sent by Atlassian Jira (v8.20.10#820010)
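The NOT EXISTS predicate in the query above can be sketched outside SQL as well. The following is a minimal, hedged Java illustration (plain collections stand in for the TXNS and MIN_HISTORY_LEVEL tables; the class and method names are made up for the example, not metastore code):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class OrphanedMhl {
    // Returns the MHL txn ids that have no matching txn_id in TXNS --
    // the same predicate as the NOT EXISTS query shown in the issue.
    static List<Long> findOrphans(Set<Long> txnIds, List<Long> mhlTxnIds) {
        List<Long> orphans = new ArrayList<>();
        for (Long id : mhlTxnIds) {
            if (!txnIds.contains(id)) {
                orphans.add(id); // orphaned: would block the cleaner until removed
            }
        }
        return orphans;
    }

    public static void main(String[] args) {
        System.out.println(findOrphans(Set.of(1L, 2L, 5L), List.of(1L, 3L, 5L, 7L)));
    }
}
```

A periodic cleanup task along these lines would delete exactly the returned ids from the MHL table, unblocking entries stuck in ready-for-cleaning.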
[jira] [Work logged] (HIVE-27189) Remove duplicate info log in Hive.isSubDIr
[ https://issues.apache.org/jira/browse/HIVE-27189?focusedWorklogId=853644&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853644 ] ASF GitHub Bot logged work on HIVE-27189: - Author: ASF GitHub Bot Created on: 29/Mar/23 10:10 Start Date: 29/Mar/23 10:10 Worklog Time Spent: 10m Work Description: shuyouZZ opened a new pull request, #4167: URL: https://github.com/apache/hive/pull/4167 ### What changes were proposed in this pull request? Remove duplicate info log in Hive.isSubDIr ### Why are the changes needed? In class org.apache.hadoop.hive.ql.metadata.Hive, invoking the method isSubDir prints the following line twice: `LOG.debug("The source path is " + fullF1 + " and the destination path is " + fullF2);` We should remove the duplicate log call. Below is an example from a log file: `23/03/27 05:11:08 INFO Hive: New loading path = hdfs://R2/projects/sellerdata_ods/hive/sellerdata_ods/ods_shopee_order_detail_v8_di_ab_test/.hive-staging_hive_2023-03-27_05-09-17_848_8941157515106120269-1/-ext-1/dt=20230327/country=VN with partSpec {dt=20230327, country=VN} 23/03/27 05:11:08 DEBUG Hive: The source path is /projects/sellerdata_ods/hive/sellerdata_ods/ods_shopee_order_detail_v8_di_ab_test/.hive-staging_hive_2023-03-27_05-09-17_848_8941157515106120269-1/-ext-1/dt=20230327/country=SG/part-00179-51723c84-5be6-428e-862e-c4716e9536cc.c000/ and the destination path is /projects/sellerdata_ods/hive/sellerdata_ods/ods_shopee_order_detail_v8_di_ab_test/dt=20230327/country=SG/part-00179-51723c84-5be6-428e-862e-c4716e9536cc.c000/ 23/03/27 05:11:08 DEBUG Hive: The source path is /projects/sellerdata_ods/hive/sellerdata_ods/ods_shopee_order_detail_v8_di_ab_test/.hive-staging_hive_2023-03-27_05-09-17_848_8941157515106120269-1/-ext-1/dt=20230327/country=SG/part-00179-51723c84-5be6-428e-862e-c4716e9536cc.c000/ and the destination path is 
/projects/sellerdata_ods/hive/sellerdata_ods/ods_shopee_order_detail_v8_di_ab_test/dt=20230327/country=SG/part-00179-51723c84-5be6-428e-862e-c4716e9536cc.c000/ 23/03/27 05:11:08 INFO Hive: Renaming src: hdfs://R2/projects/sellerdata_ods/hive/sellerdata_ods/ods_shopee_order_detail_v8_di_ab_test/.hive-staging_hive_2023-03-27_05-09-17_848_8941157515106120269-1/-ext-1/dt=20230327/country=SG/part-00179-51723c84-5be6-428e-862e-c4716e9536cc.c000, dest: hdfs://R2/projects/sellerdata_ods/hive/sellerdata_ods/ods_shopee_order_detail_v8_di_ab_test/dt=20230327/country=SG/part-00179-51723c84-5be6-428e-862e-c4716e9536cc.c000, Status:true 23/03/27 05:11:09 DEBUG Hive: altering partition for table ods_shopee_order_detail_v8_di_ab_test with partition spec : {dt=20230327, country=SG} ` ### Does this PR introduce _any_ user-facing change? NO ### How was this patch tested? No need Issue Time Tracking --- Worklog Id: (was: 853644) Remaining Estimate: 0h Time Spent: 10m > Remove duplicate info log in Hive.isSubDIr > -- > > Key: HIVE-27189 > URL: https://issues.apache.org/jira/browse/HIVE-27189 > Project: Hive > Issue Type: Improvement >Reporter: shuyouZZ >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > In class {{{}org.apache.hadoop.hive.ql.metadata.HIve{}}}, invoke method > {{isSubDir}} will print twice > {code:java} > LOG.debug("The source path is " + fullF1 + " and the destination path is " + > fullF2);{code} > we should remove the duplicate info log. -- This message was sent by Atlassian Jira (v8.20.10#820010)
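The check that the duplicated debug line guards can be sketched in miniature. The following is a hedged, simplified stand-in for Hive.isSubDir (all names are illustrative, not the actual Hive implementation): with trailing slashes normalized, a path is a sub-path of another when the former starts with the latter, and the debug line is emitted once rather than twice:

```java
public class SubDirCheck {
    static boolean isSubDir(String srcf, String destf) {
        // Normalize with trailing slashes so "/tbla" is not treated as under "/tbl".
        String fullF1 = srcf.endsWith("/") ? srcf : srcf + "/";
        String fullF2 = destf.endsWith("/") ? destf : destf + "/";
        // Log once; the patch removes the second, identical debug call.
        System.out.println("The source path is " + fullF1 + " and the destination path is " + fullF2);
        return fullF1.startsWith(fullF2);
    }

    public static void main(String[] args) {
        System.out.println(isSubDir("/tbl/dt=20230327/part-00179.c000", "/tbl"));
    }
}
```

Since the method itself is unchanged by the patch, dropping the redundant call only halves the DEBUG-level noise visible in logs like the one above.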
[jira] [Assigned] (HIVE-27190) Implement col stats cache for hive iceberg table
[ https://issues.apache.org/jira/browse/HIVE-27190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simhadri Govindappa reassigned HIVE-27190: -- Assignee: Simhadri Govindappa > Implement col stats cache for hive iceberg table > - > > Key: HIVE-27190 > URL: https://issues.apache.org/jira/browse/HIVE-27190 > Project: Hive > Issue Type: Improvement >Reporter: Simhadri Govindappa >Assignee: Simhadri Govindappa >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-26809) Upgrade ORC to 1.8.3
[ https://issues.apache.org/jira/browse/HIVE-26809?focusedWorklogId=853643&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853643 ] ASF GitHub Bot logged work on HIVE-26809: - Author: ASF GitHub Bot Created on: 29/Mar/23 10:03 Start Date: 29/Mar/23 10:03 Worklog Time Spent: 10m Work Description: sonarcloud[bot] commented on PR #4121: URL: https://github.com/apache/hive/pull/4121#issuecomment-1488304670 Kudos, SonarCloud Quality Gate passed! 0 Bugs, 0 Vulnerabilities, 0 Security Hotspots, 2 Code Smells; no coverage or duplication information. Issue Time Tracking --- Worklog Id: (was: 853643) Time Spent: 8.5h (was: 8h 20m) > Upgrade ORC to 1.8.3 > > > Key: HIVE-26809 > URL: https://issues.apache.org/jira/browse/HIVE-26809 > Project: Hive > Issue Type: Improvement >Affects Versions: 4.0.0 >Reporter: Dmitriy Fingerman >Assignee: Zoltán Rátkai >Priority: Major > Labels: pull-request-available > Time Spent: 8.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-27135) AcidUtils#getHdfsDirSnapshots() throws FNFE when a directory is removed in HDFS
[ https://issues.apache.org/jira/browse/HIVE-27135?focusedWorklogId=853642&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853642 ] ASF GitHub Bot logged work on HIVE-27135: - Author: ASF GitHub Bot Created on: 29/Mar/23 10:00 Start Date: 29/Mar/23 10:00 Worklog Time Spent: 10m Work Description: deniskuzZ commented on code in PR #4114: URL: https://github.com/apache/hive/pull/4114#discussion_r1151692776 ## ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java: ## @@ -1538,32 +1538,36 @@ private static HdfsDirSnapshot addToSnapshot(Map<Path, HdfsDirSnapshot> dirToSnapshots, public static Map<Path, HdfsDirSnapshot> getHdfsDirSnapshots(final FileSystem fs, final Path path) throws IOException { Map<Path, HdfsDirSnapshot> dirToSnapshots = new HashMap<>(); -RemoteIterator<LocatedFileStatus> itr = FileUtils.listFiles(fs, path, true, acidHiddenFileFilter); -while (itr.hasNext()) { - FileStatus fStatus = itr.next(); - Path fPath = fStatus.getPath(); - if (fStatus.isDirectory() && acidTempDirFilter.accept(fPath)) { -addToSnapshot(dirToSnapshots, fPath); - } else { -Path parentDirPath = fPath.getParent(); -if (acidTempDirFilter.accept(parentDirPath)) { - while (isChildOfDelta(parentDirPath, path)) { -// Some cases there are other directory layers between the delta and the datafiles -// (export-import mm table, insert with union all to mm table, skewed tables). 
-// But it does not matter for the AcidState, we just need the deltas and the data files -// So build the snapshot with the files inside the delta directory -parentDirPath = parentDirPath.getParent(); - } - HdfsDirSnapshot dirSnapshot = addToSnapshot(dirToSnapshots, parentDirPath); - // We're not filtering out the metadata file and acid format file, - // as they represent parts of a valid snapshot - // We're not using the cached values downstream, but we can potentially optimize more in a follow-up task - if (fStatus.getPath().toString().contains(MetaDataFile.METADATA_FILE)) { -dirSnapshot.addMetadataFile(fStatus); - } else if (fStatus.getPath().toString().contains(OrcAcidVersion.ACID_FORMAT)) { -dirSnapshot.addOrcAcidFormatFile(fStatus); - } else { -dirSnapshot.addFile(fStatus); +Deque<RemoteIterator<LocatedFileStatus>> stack = new ArrayDeque<>(); Review Comment: please update the PR description Issue Time Tracking --- Worklog Id: (was: 853642) Time Spent: 6h (was: 5h 50m) > AcidUtils#getHdfsDirSnapshots() throws FNFE when a directory is removed in > HDFS > --- > > Key: HIVE-27135 > URL: https://issues.apache.org/jira/browse/HIVE-27135 > Project: Hive > Issue Type: Bug >Reporter: Dayakar M >Assignee: Dayakar M >Priority: Major > Labels: pull-request-available > Time Spent: 6h > Remaining Estimate: 0h > > AcidUtils#getHdfsDirSnapshots() throws FileNotFoundException when a directory > is removed in HDFS while fetching HDFS Snapshots. > Below test code can be used to reproduce this issue. 
> {code:java} > @Test > public void > testShouldNotThrowFNFEWhenHiveStagingDirectoryIsRemovedWhileFetchingHDFSSnapshots() > throws Exception { > MockFileSystem fs = new MockFileSystem(new HiveConf(), > new MockFile("mock:/tbl/part1/.hive-staging_dir/-ext-10002", 500, new > byte[0]), > new MockFile("mock:/tbl/part2/.hive-staging_dir", 500, new byte[0]), > new MockFile("mock:/tbl/part1/_tmp_space.db", 500, new byte[0]), > new MockFile("mock:/tbl/part1/delta_1_1/bucket--", 500, new > byte[0])); > Path path = new MockPath(fs, "/tbl"); > Path stageDir = new MockPath(fs, "mock:/tbl/part1/.hive-staging_dir"); > FileSystem mockFs = spy(fs); > Mockito.doThrow(new > FileNotFoundException("")).when(mockFs).listLocatedStatus(eq(stageDir)); > try { > Map<Path, HdfsDirSnapshot> hdfsDirSnapshots = > AcidUtils.getHdfsDirSnapshots(mockFs, path); > Assert.assertEquals(1, hdfsDirSnapshots.size()); > } > catch (FileNotFoundException fnf) { > fail("Should not throw FileNotFoundException when a directory is > removed while fetching HDFSSnapshots"); > } > }{code} > This issue got fixed as a part of HIVE-26481, but it is not fixed completely > here. > The code [here|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java#L1541] > uses the FileUtils.listFiles() API, which returns a RemoteIterator. > While iterating, if an entry is a directory and the listing is recursive, it > tries to list the files in that directory; if that directory has been removed > by another thread/task in the meantime, it throws FileNotFoundException.
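The stack-of-iterators pattern under review in this PR can be sketched in miniature. The following is a hedged illustration, not the actual AcidUtils code: an in-memory map stands in for the HDFS FileSystem (entries ending in "/" are directories), and all names are made up for the example. The key property is that a directory that disappears mid-walk is treated as empty instead of aborting the whole listing with a FileNotFoundException:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

public class SnapshotWalk {
    // fs maps a directory path to its children; a directory that has been
    // "removed" concurrently is simply absent from the map.
    static List<String> walk(Map<String, List<String>> fs, String root) {
        List<String> files = new ArrayList<>();
        Deque<Iterator<String>> stack = new ArrayDeque<>();
        stack.push(fs.getOrDefault(root, List.of()).iterator());
        while (!stack.isEmpty()) {
            Iterator<String> itr = stack.pop();
            while (itr.hasNext()) {
                String p = itr.next();
                if (p.endsWith("/")) {
                    // Directory: list it lazily in a later pass; if it
                    // vanished, push an empty listing instead of failing.
                    stack.push(fs.getOrDefault(p, List.of()).iterator());
                } else {
                    files.add(p);
                }
            }
        }
        return files;
    }
}
```

Listing one directory at a time is what makes the fix possible: a single fully-recursive RemoteIterator fails wholesale the moment any subdirectory is deleted, whereas per-directory iterators localize the failure to the directory that disappeared.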
[jira] [Work logged] (HIVE-27135) AcidUtils#getHdfsDirSnapshots() throws FNFE when a directory is removed in HDFS
[ https://issues.apache.org/jira/browse/HIVE-27135?focusedWorklogId=853641=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853641 ] ASF GitHub Bot logged work on HIVE-27135: - Author: ASF GitHub Bot Created on: 29/Mar/23 09:58 Start Date: 29/Mar/23 09:58 Worklog Time Spent: 10m Work Description: deniskuzZ commented on code in PR #4114: URL: https://github.com/apache/hive/pull/4114#discussion_r1151683610 ## ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java: ## @@ -1538,32 +1538,36 @@ private static HdfsDirSnapshot addToSnapshot(Map dirToSna public static Map getHdfsDirSnapshots(final FileSystem fs, final Path path) throws IOException { Map dirToSnapshots = new HashMap<>(); -RemoteIterator itr = FileUtils.listFiles(fs, path, true, acidHiddenFileFilter); -while (itr.hasNext()) { - FileStatus fStatus = itr.next(); - Path fPath = fStatus.getPath(); - if (fStatus.isDirectory() && acidTempDirFilter.accept(fPath)) { -addToSnapshot(dirToSnapshots, fPath); - } else { -Path parentDirPath = fPath.getParent(); -if (acidTempDirFilter.accept(parentDirPath)) { - while (isChildOfDelta(parentDirPath, path)) { -// Some cases there are other directory layers between the delta and the datafiles -// (export-import mm table, insert with union all to mm table, skewed tables). 
-// But it does not matter for the AcidState, we just need the deltas and the data files -// So build the snapshot with the files inside the delta directory -parentDirPath = parentDirPath.getParent(); - } - HdfsDirSnapshot dirSnapshot = addToSnapshot(dirToSnapshots, parentDirPath); - // We're not filtering out the metadata file and acid format file, - // as they represent parts of a valid snapshot - // We're not using the cached values downstream, but we can potentially optimize more in a follow-up task - if (fStatus.getPath().toString().contains(MetaDataFile.METADATA_FILE)) { -dirSnapshot.addMetadataFile(fStatus); - } else if (fStatus.getPath().toString().contains(OrcAcidVersion.ACID_FORMAT)) { -dirSnapshot.addOrcAcidFormatFile(fStatus); - } else { -dirSnapshot.addFile(fStatus); +Deque> stack = new ArrayDeque<>(); +stack.push(FileUtils.listLocatedStatusIterator(fs, path, acidHiddenFileFilter)); +while (!stack.isEmpty()) { + RemoteIterator itr = stack.pop(); + while (itr.hasNext()) { +FileStatus fStatus = itr.next(); +Path fPath = fStatus.getPath(); +if (fStatus.isDirectory()) { + stack.push(FileUtils.listLocatedStatusIterator(fs, fPath, acidHiddenFileFilter)); Review Comment: what if the folder is empty? that was previously included if (fStatus.isDirectory() && acidTempDirFilter.accept(fPath)) { addToSnapshot(dirToSnapshots, fPath); Issue Time Tracking --- Worklog Id: (was: 853641) Time Spent: 5h 50m (was: 5h 40m) > AcidUtils#getHdfsDirSnapshots() throws FNFE when a directory is removed in > HDFS > --- > > Key: HIVE-27135 > URL: https://issues.apache.org/jira/browse/HIVE-27135 > Project: Hive > Issue Type: Bug >Reporter: Dayakar M >Assignee: Dayakar M >Priority: Major > Labels: pull-request-available > Time Spent: 5h 50m > Remaining Estimate: 0h > > AcidUtils#getHdfsDirSnapshots() throws FileNotFoundException when a directory > is removed in HDFS while fetching HDFS Snapshots. > Below testcode can be used to reproduce this issue. 
> {code:java} > @Test > public void > testShouldNotThrowFNFEWhenHiveStagingDirectoryIsRemovedWhileFetchingHDFSSnapshots() > throws Exception { > MockFileSystem fs = new MockFileSystem(new HiveConf(), > new MockFile("mock:/tbl/part1/.hive-staging_dir/-ext-10002", 500, new > byte[0]), > new MockFile("mock:/tbl/part2/.hive-staging_dir", 500, new byte[0]), > new MockFile("mock:/tbl/part1/_tmp_space.db", 500, new byte[0]), > new MockFile("mock:/tbl/part1/delta_1_1/bucket--", 500, new > byte[0])); > Path path = new MockPath(fs, "/tbl"); > Path stageDir = new MockPath(fs, "mock:/tbl/part1/.hive-staging_dir"); > FileSystem mockFs = spy(fs); > Mockito.doThrow(new > FileNotFoundException("")).when(mockFs).listLocatedStatus(eq(stageDir)); > try { > Map hdfsDirSnapshots = > AcidUtils.getHdfsDirSnapshots(mockFs, path); > Assert.assertEquals(1, hdfsDirSnapshots.size()); > } > catch (FileNotFoundException fnf) { > fail("Should not throw
fetching HDFSSnapshots"); > } > }{code}
[jira] [Work logged] (HIVE-27135) AcidUtils#getHdfsDirSnapshots() throws FNFE when a directory is removed in HDFS
[ https://issues.apache.org/jira/browse/HIVE-27135?focusedWorklogId=853640=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853640 ] ASF GitHub Bot logged work on HIVE-27135: - Author: ASF GitHub Bot Created on: 29/Mar/23 09:55 Start Date: 29/Mar/23 09:55 Worklog Time Spent: 10m Work Description: deniskuzZ commented on code in PR #4114: URL: https://github.com/apache/hive/pull/4114#discussion_r1151684557 ## ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java: ## @@ -1538,32 +1538,36 @@ private static HdfsDirSnapshot addToSnapshot(Map dirToSna public static Map getHdfsDirSnapshots(final FileSystem fs, final Path path) throws IOException { Map dirToSnapshots = new HashMap<>(); -RemoteIterator itr = FileUtils.listFiles(fs, path, true, acidHiddenFileFilter); -while (itr.hasNext()) { - FileStatus fStatus = itr.next(); - Path fPath = fStatus.getPath(); - if (fStatus.isDirectory() && acidTempDirFilter.accept(fPath)) { -addToSnapshot(dirToSnapshots, fPath); - } else { -Path parentDirPath = fPath.getParent(); -if (acidTempDirFilter.accept(parentDirPath)) { - while (isChildOfDelta(parentDirPath, path)) { -// Some cases there are other directory layers between the delta and the datafiles -// (export-import mm table, insert with union all to mm table, skewed tables). 
-// But it does not matter for the AcidState, we just need the deltas and the data files -// So build the snapshot with the files inside the delta directory -parentDirPath = parentDirPath.getParent(); - } - HdfsDirSnapshot dirSnapshot = addToSnapshot(dirToSnapshots, parentDirPath); - // We're not filtering out the metadata file and acid format file, - // as they represent parts of a valid snapshot - // We're not using the cached values downstream, but we can potentially optimize more in a follow-up task - if (fStatus.getPath().toString().contains(MetaDataFile.METADATA_FILE)) { -dirSnapshot.addMetadataFile(fStatus); - } else if (fStatus.getPath().toString().contains(OrcAcidVersion.ACID_FORMAT)) { -dirSnapshot.addOrcAcidFormatFile(fStatus); - } else { -dirSnapshot.addFile(fStatus); +Deque> stack = new ArrayDeque<>(); +stack.push(FileUtils.listLocatedStatusIterator(fs, path, acidHiddenFileFilter)); +while (!stack.isEmpty()) { + RemoteIterator itr = stack.pop(); + while (itr.hasNext()) { +FileStatus fStatus = itr.next(); +Path fPath = fStatus.getPath(); +if (fStatus.isDirectory()) { + stack.push(FileUtils.listLocatedStatusIterator(fs, fPath, acidHiddenFileFilter)); +} else { + Path parentDirPath = fPath.getParent(); + if (acidTempDirFilter.accept(parentDirPath)) { Review Comment: Issue Time Tracking --- Worklog Id: (was: 853640) Time Spent: 5h 40m (was: 5.5h) > AcidUtils#getHdfsDirSnapshots() throws FNFE when a directory is removed in > HDFS > --- > > Key: HIVE-27135 > URL: https://issues.apache.org/jira/browse/HIVE-27135 > Project: Hive > Issue Type: Bug >Reporter: Dayakar M >Assignee: Dayakar M >Priority: Major > Labels: pull-request-available > Time Spent: 5h 40m > Remaining Estimate: 0h > > AcidUtils#getHdfsDirSnapshots() throws FileNotFoundException when a directory > is removed in HDFS while fetching HDFS Snapshots. > Below testcode can be used to reproduce this issue. 
> {code:java} > @Test > public void > testShouldNotThrowFNFEWhenHiveStagingDirectoryIsRemovedWhileFetchingHDFSSnapshots() > throws Exception { > MockFileSystem fs = new MockFileSystem(new HiveConf(), > new MockFile("mock:/tbl/part1/.hive-staging_dir/-ext-10002", 500, new > byte[0]), > new MockFile("mock:/tbl/part2/.hive-staging_dir", 500, new byte[0]), > new MockFile("mock:/tbl/part1/_tmp_space.db", 500, new byte[0]), > new MockFile("mock:/tbl/part1/delta_1_1/bucket--", 500, new > byte[0])); > Path path = new MockPath(fs, "/tbl"); > Path stageDir = new MockPath(fs, "mock:/tbl/part1/.hive-staging_dir"); > FileSystem mockFs = spy(fs); > Mockito.doThrow(new > FileNotFoundException("")).when(mockFs).listLocatedStatus(eq(stageDir)); > try { > Map hdfsDirSnapshots = > AcidUtils.getHdfsDirSnapshots(mockFs, path); > Assert.assertEquals(1, hdfsDirSnapshots.size()); > } > catch (FileNotFoundException fnf) { > fail("Should not throw FileNotFoundException when a directory is > removed while
FileNotFoundException when a directory is removed while fetching > HDFSSnapshots"); > } > }{code}
[jira] [Work logged] (HIVE-27135) AcidUtils#getHdfsDirSnapshots() throws FNFE when a directory is removed in HDFS
[ https://issues.apache.org/jira/browse/HIVE-27135?focusedWorklogId=853639=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853639 ] ASF GitHub Bot logged work on HIVE-27135: - Author: ASF GitHub Bot Created on: 29/Mar/23 09:54 Start Date: 29/Mar/23 09:54 Worklog Time Spent: 10m Work Description: deniskuzZ commented on code in PR #4114: URL: https://github.com/apache/hive/pull/4114#discussion_r1151683610 ## ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java: ## @@ -1538,32 +1538,36 @@ private static HdfsDirSnapshot addToSnapshot(Map dirToSna public static Map getHdfsDirSnapshots(final FileSystem fs, final Path path) throws IOException { Map dirToSnapshots = new HashMap<>(); -RemoteIterator itr = FileUtils.listFiles(fs, path, true, acidHiddenFileFilter); -while (itr.hasNext()) { - FileStatus fStatus = itr.next(); - Path fPath = fStatus.getPath(); - if (fStatus.isDirectory() && acidTempDirFilter.accept(fPath)) { -addToSnapshot(dirToSnapshots, fPath); - } else { -Path parentDirPath = fPath.getParent(); -if (acidTempDirFilter.accept(parentDirPath)) { - while (isChildOfDelta(parentDirPath, path)) { -// Some cases there are other directory layers between the delta and the datafiles -// (export-import mm table, insert with union all to mm table, skewed tables). 
-// But it does not matter for the AcidState, we just need the deltas and the data files -// So build the snapshot with the files inside the delta directory -parentDirPath = parentDirPath.getParent(); - } - HdfsDirSnapshot dirSnapshot = addToSnapshot(dirToSnapshots, parentDirPath); - // We're not filtering out the metadata file and acid format file, - // as they represent parts of a valid snapshot - // We're not using the cached values downstream, but we can potentially optimize more in a follow-up task - if (fStatus.getPath().toString().contains(MetaDataFile.METADATA_FILE)) { -dirSnapshot.addMetadataFile(fStatus); - } else if (fStatus.getPath().toString().contains(OrcAcidVersion.ACID_FORMAT)) { -dirSnapshot.addOrcAcidFormatFile(fStatus); - } else { -dirSnapshot.addFile(fStatus); +Deque> stack = new ArrayDeque<>(); +stack.push(FileUtils.listLocatedStatusIterator(fs, path, acidHiddenFileFilter)); +while (!stack.isEmpty()) { + RemoteIterator itr = stack.pop(); + while (itr.hasNext()) { +FileStatus fStatus = itr.next(); +Path fPath = fStatus.getPath(); +if (fStatus.isDirectory()) { + stack.push(FileUtils.listLocatedStatusIterator(fs, fPath, acidHiddenFileFilter)); Review Comment: what if the folder is empty? that was previously included Issue Time Tracking --- Worklog Id: (was: 853639) Time Spent: 5.5h (was: 5h 20m) > AcidUtils#getHdfsDirSnapshots() throws FNFE when a directory is removed in > HDFS > --- > > Key: HIVE-27135 > URL: https://issues.apache.org/jira/browse/HIVE-27135 > Project: Hive > Issue Type: Bug >Reporter: Dayakar M >Assignee: Dayakar M >Priority: Major > Labels: pull-request-available > Time Spent: 5.5h > Remaining Estimate: 0h > > AcidUtils#getHdfsDirSnapshots() throws FileNotFoundException when a directory > is removed in HDFS while fetching HDFS Snapshots. > Below testcode can be used to reproduce this issue. 
> {code:java}
> @Test
> public void testShouldNotThrowFNFEWhenHiveStagingDirectoryIsRemovedWhileFetchingHDFSSnapshots() throws Exception {
>   MockFileSystem fs = new MockFileSystem(new HiveConf(),
>       new MockFile("mock:/tbl/part1/.hive-staging_dir/-ext-10002", 500, new byte[0]),
>       new MockFile("mock:/tbl/part2/.hive-staging_dir", 500, new byte[0]),
>       new MockFile("mock:/tbl/part1/_tmp_space.db", 500, new byte[0]),
>       new MockFile("mock:/tbl/part1/delta_1_1/bucket--", 500, new byte[0]));
>   Path path = new MockPath(fs, "/tbl");
>   Path stageDir = new MockPath(fs, "mock:/tbl/part1/.hive-staging_dir");
>   FileSystem mockFs = spy(fs);
>   Mockito.doThrow(new FileNotFoundException("")).when(mockFs).listLocatedStatus(eq(stageDir));
>   try {
>     Map<Path, AcidUtils.HdfsDirSnapshot> hdfsDirSnapshots = AcidUtils.getHdfsDirSnapshots(mockFs, path);
>     Assert.assertEquals(1, hdfsDirSnapshots.size());
>   } catch (FileNotFoundException fnf) {
>     fail("Should not throw FileNotFoundException when a directory is removed while fetching HDFSSnapshots");
>   }
> }{code}
> This issue got fixed as a part of
[jira] [Work logged] (HIVE-27135) AcidUtils#getHdfsDirSnapshots() throws FNFE when a directory is removed in HDFS
[ https://issues.apache.org/jira/browse/HIVE-27135?focusedWorklogId=853638&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853638 ] ASF GitHub Bot logged work on HIVE-27135: - Author: ASF GitHub Bot Created on: 29/Mar/23 09:51 Start Date: 29/Mar/23 09:51 Worklog Time Spent: 10m Work Description: deniskuzZ commented on code in PR #4114: URL: https://github.com/apache/hive/pull/4114#discussion_r1151679605

## common/src/java/org/apache/hadoop/hive/common/FileUtils.java:
@@ -1376,6 +1376,12 @@ public static RemoteIterator<FileStatus> listStatusIterator(FileSystem fs, Path path, PathFilter filter) {
         status -> filter.accept(status.getPath()));
   }
+  public static RemoteIterator<LocatedFileStatus> listLocatedStatusIterator(FileSystem fs, Path path, PathFilter filter)

Review Comment: it's not required, FileStatus object is enough

Issue Time Tracking --- Worklog Id: (was: 853638) Time Spent: 5h 20m (was: 5h 10m)

> AcidUtils#getHdfsDirSnapshots() throws FNFE when a directory is removed in HDFS
> ---
> Key: HIVE-27135
> URL: https://issues.apache.org/jira/browse/HIVE-27135
> Project: Hive
> Issue Type: Bug
> Reporter: Dayakar M
> Assignee: Dayakar M
> Priority: Major
> Labels: pull-request-available
> Time Spent: 5h 20m
> Remaining Estimate: 0h
>
> AcidUtils#getHdfsDirSnapshots() throws FileNotFoundException when a directory is removed in HDFS while fetching HDFS Snapshots.
> Below testcode can be used to reproduce this issue.
> {code:java}
> @Test
> public void testShouldNotThrowFNFEWhenHiveStagingDirectoryIsRemovedWhileFetchingHDFSSnapshots() throws Exception {
>   MockFileSystem fs = new MockFileSystem(new HiveConf(),
>       new MockFile("mock:/tbl/part1/.hive-staging_dir/-ext-10002", 500, new byte[0]),
>       new MockFile("mock:/tbl/part2/.hive-staging_dir", 500, new byte[0]),
>       new MockFile("mock:/tbl/part1/_tmp_space.db", 500, new byte[0]),
>       new MockFile("mock:/tbl/part1/delta_1_1/bucket--", 500, new byte[0]));
>   Path path = new MockPath(fs, "/tbl");
>   Path stageDir = new MockPath(fs, "mock:/tbl/part1/.hive-staging_dir");
>   FileSystem mockFs = spy(fs);
>   Mockito.doThrow(new FileNotFoundException("")).when(mockFs).listLocatedStatus(eq(stageDir));
>   try {
>     Map<Path, AcidUtils.HdfsDirSnapshot> hdfsDirSnapshots = AcidUtils.getHdfsDirSnapshots(mockFs, path);
>     Assert.assertEquals(1, hdfsDirSnapshots.size());
>   } catch (FileNotFoundException fnf) {
>     fail("Should not throw FileNotFoundException when a directory is removed while fetching HDFSSnapshots");
>   }
> }{code}
> This issue got fixed as a part of HIVE-26481, but it is not fixed completely here.
> [Here|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java#L1541] the FileUtils.listFiles() API is used, which returns a RemoteIterator.
> While iterating over it, if the entry is a directory and the listing is recursive, it will try to list the files in that directory; but if that directory has been removed by another thread/task in the meantime, a FileNotFoundException is thrown. Here the directory which got removed is the .staging directory, which should have been excluded by using the passed filter.
>
> So here we can use the same logic written in the _org.apache.hadoop.hive.ql.io.AcidUtils#getHdfsDirSnapshotsForCleaner()_ API to avoid the FileNotFoundException. -- This message was sent by Atlassian Jira (v8.20.10#820010)
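The review thread above converges on an explicit-stack traversal that lists one directory at a time: that shape is what makes it possible both to tolerate a directory vanishing mid-walk (the `.hive-staging` case) and to keep an entry for empty directories (the reviewer's "what if the folder is empty?" question). A minimal, Hadoop-free sketch of the same idea using `java.nio` — class and method names here are illustrative, not Hive's actual code:

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SnapshotWalk {
    // Iterative traversal with an explicit stack, mirroring the PR's approach:
    // each directory gets its own listing, so a directory deleted mid-walk
    // surfaces as a NoSuchFileException for that one listing and can be
    // skipped instead of aborting the whole walk. A map entry is created when
    // a directory is visited, so empty directories are recorded too.
    public static Map<Path, List<Path>> snapshot(Path root) throws IOException {
        Map<Path, List<Path>> dirToFiles = new HashMap<>();
        Deque<Path> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            Path dir = stack.pop();
            // Register the directory up front so empty ones are not lost.
            List<Path> files = dirToFiles.computeIfAbsent(dir, d -> new ArrayList<>());
            try (DirectoryStream<Path> children = Files.newDirectoryStream(dir)) {
                for (Path child : children) {
                    if (Files.isDirectory(child)) {
                        stack.push(child);
                    } else {
                        files.add(child);
                    }
                }
            } catch (NoSuchFileException removedConcurrently) {
                // Directory vanished between discovery and listing:
                // drop it and continue, instead of propagating an FNFE.
                dirToFiles.remove(dir);
            }
        }
        return dirToFiles;
    }
}
```

The contrast with the old code is that a single recursive `listFiles` iterator fails as a whole when any subdirectory disappears, while per-directory iterators localize the failure.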
[jira] [Updated] (HIVE-27189) Remove duplicate info log in Hive.isSubDIr
[ https://issues.apache.org/jira/browse/HIVE-27189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shuyouZZ updated HIVE-27189: Description: In class {{org.apache.hadoop.hive.ql.metadata.Hive}}, invoking the method {{isSubDir}} will print the following log line twice:
{code:java}
LOG.debug("The source path is " + fullF1 + " and the destination path is " + fullF2);{code}
We should remove the duplicate log call.

> Remove duplicate info log in Hive.isSubDIr
> ---
> Key: HIVE-27189
> URL: https://issues.apache.org/jira/browse/HIVE-27189
> Project: Hive
> Issue Type: Improvement
> Reporter: shuyouZZ
> Priority: Major
>
> In class {{org.apache.hadoop.hive.ql.metadata.Hive}}, invoking the method {{isSubDir}} will print the following log line twice:
> {code:java}
> LOG.debug("The source path is " + fullF1 + " and the destination path is " + fullF2);{code}
> We should remove the duplicate log call. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-27033) Backport of HIVE-23044: Make sure Cleaner doesn't delete delta directories for running queries
[ https://issues.apache.org/jira/browse/HIVE-27033?focusedWorklogId=853625=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853625 ] ASF GitHub Bot logged work on HIVE-27033: - Author: ASF GitHub Bot Created on: 29/Mar/23 08:42 Start Date: 29/Mar/23 08:42 Worklog Time Spent: 10m Work Description: amanraj2520 commented on PR #4027: URL: https://github.com/apache/hive/pull/4027#issuecomment-1488176389 @zabetak @abstractdog @vihangk1 Can you please approve and merge this. This already was present in Hive 3.1.3 release Issue Time Tracking --- Worklog Id: (was: 853625) Time Spent: 0.5h (was: 20m) > Backport of HIVE-23044: Make sure Cleaner doesn't delete delta directories > for running queries > -- > > Key: HIVE-27033 > URL: https://issues.apache.org/jira/browse/HIVE-27033 > Project: Hive > Issue Type: Sub-task >Reporter: Aman Raj >Assignee: Aman Raj >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-27145) Use StrictMath for remaining Math functions as followup of HIVE-23133
[ https://issues.apache.org/jira/browse/HIVE-27145?focusedWorklogId=853598&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853598 ] ASF GitHub Bot logged work on HIVE-27145: - Author: ASF GitHub Bot Created on: 29/Mar/23 07:40 Start Date: 29/Mar/23 07:40 Worklog Time Spent: 10m Work Description: kasakrisz commented on PR #4122: URL: https://github.com/apache/hive/pull/4122#issuecomment-1488093652 @rbalamohan IIUC, the `Math` implementations of the functions affected by this PR may exploit specific microprocessor instructions where available on the platform, which gives better performance, while the default implementation just calls the `StrictMath` version. The cost of the performance boost is precision. Example from the PR: `Degrees(cdecimal1)` ``` -6844.522849943508 ``` vs ``` -6844.522849944 ``` Should we give up possible performance benefits in favor of precision? Could you please share your thoughts? Issue Time Tracking --- Worklog Id: (was: 853598) Time Spent: 40m (was: 0.5h)

> Use StrictMath for remaining Math functions as followup of HIVE-23133
> ---
> Key: HIVE-27145
> URL: https://issues.apache.org/jira/browse/HIVE-27145
> Project: Hive
> Issue Type: Task
> Components: UDF
> Reporter: Himanshu Mishra
> Assignee: Himanshu Mishra
> Priority: Major
> Labels: pull-request-available
> Time Spent: 40m
> Remaining Estimate: 0h
>
> [HIVE-23133|https://issues.apache.org/jira/browse/HIVE-23133] started using {{StrictMath}} for the {{cos, exp, log}} UDFs to fix QTests that were failing because results vary with the underlying hardware when the Math library is used.
> Follow it up by using {{StrictMath}} for the other Math functions that can be affected the same way by the underlying hardware, namely {{sin, tan, asin, acos, atan, sqrt, pow, cbrt}}.
> [JDK-4477961|https://bugs.openjdk.org/browse/JDK-4477961] (in Java 9) changed the radians and degrees calculation, leading to Q Test failures when tests are run on Java 9+; fix such tests. -- This message was sent by Atlassian Jira (v8.20.10#820010)
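The precision/performance tradeoff discussed in the comment above comes straight from the JDK contract: `Math` methods may be compiled to platform intrinsics and are only required to stay within about 1 ulp of the correctly rounded result, while `StrictMath` must reproduce the fdlibm reference algorithms bit for bit on every platform. A small illustrative sketch (the input value is arbitrary, not taken from the PR):

```java
public class StrictVsFastMath {
    public static void main(String[] args) {
        double x = 0.7390851332151607; // arbitrary test input

        // Math.sin may use a hardware intrinsic; the spec only requires the
        // result to be within 1 ulp of the true value and semi-monotonic,
        // so it can differ between platforms.
        double fast = Math.sin(x);

        // StrictMath.sin follows the fdlibm reference implementation, so this
        // value is bitwise identical on every JVM and every CPU.
        double strict = StrictMath.sin(x);

        // On most platforms the two agree exactly; when they differ, the gap
        // is at most a couple of ulps.
        System.out.println("fast=" + fast + " strict=" + strict
            + " diff=" + Math.abs(fast - strict));
    }
}
```

This is why moving the UDFs to `StrictMath` makes the q-test golden files stable across machines, at the cost of giving up any intrinsic speedup.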
[jira] [Work logged] (HIVE-27187) Incremental rebuild of materialized view having aggregate and stored by iceberg
[ https://issues.apache.org/jira/browse/HIVE-27187?focusedWorklogId=853595&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853595 ] ASF GitHub Bot logged work on HIVE-27187: - Author: ASF GitHub Bot Created on: 29/Mar/23 07:36 Start Date: 29/Mar/23 07:36 Worklog Time Spent: 10m Work Description: sonarcloud[bot] commented on PR #4166: URL: https://github.com/apache/hive/pull/4166#issuecomment-1488089545 Kudos, SonarCloud Quality Gate passed! 0 Bugs, 0 Vulnerabilities, 0 Security Hotspots, 4 Code Smells; no coverage information, no duplication information. Issue Time Tracking --- Worklog Id: (was: 853595) Time Spent: 20m (was: 10m)

> Incremental rebuild of materialized view having aggregate and stored by iceberg
> ---
> Key: HIVE-27187
> URL: https://issues.apache.org/jira/browse/HIVE-27187
> Project: Hive
> Issue Type: Improvement
> Components: Iceberg integration, Materialized views
> Reporter: Krisztian Kasa
> Assignee: Krisztian Kasa
> Priority: Major
> Labels: pull-request-available
> Time Spent: 20m
> Remaining Estimate: 0h
>
> Currently, the incremental rebuild of a materialized view stored by Iceberg whose definition query contains an aggregate operator is transformed into an insert overwrite statement containing a union operator if the source tables contain insert operations
only. One branch of the union scans the view, the
> other produces the delta.
> This can be improved further: transform the statement into a multi-insert statement representing a merge statement, to insert new aggregations and update existing ones. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-27172) Add the HMS client connection timeout config
[ https://issues.apache.org/jira/browse/HIVE-27172?focusedWorklogId=853592&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853592 ] ASF GitHub Bot logged work on HIVE-27172: - Author: ASF GitHub Bot Created on: 29/Mar/23 07:33 Start Date: 29/Mar/23 07:33 Worklog Time Spent: 10m Work Description: wecharyu commented on PR #4150: URL: https://github.com/apache/hive/pull/4150#issuecomment-1488086581 @ayushtkn @deniskuzZ @kasakrisz: Could you help review this PR? Issue Time Tracking --- Worklog Id: (was: 853592) Time Spent: 1h 10m (was: 1h)

> Add the HMS client connection timeout config
> ---
> Key: HIVE-27172
> URL: https://issues.apache.org/jira/browse/HIVE-27172
> Project: Hive
> Issue Type: Task
> Components: Hive
> Reporter: Wechar
> Assignee: Wechar
> Priority: Major
> Labels: pull-request-available
> Time Spent: 1h 10m
> Remaining Estimate: 0h
>
> Currently {{HiveMetastoreClient}} uses {{CLIENT_SOCKET_TIMEOUT}} as both the socket timeout and the connection timeout, so it is not convenient for users to set a smaller connection timeout. -- This message was sent by Atlassian Jira (v8.20.10#820010)
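The distinction the issue above draws — a connection (connect) timeout separate from the socket (read) timeout — maps directly onto the standard `java.net` API that Thrift's transport ultimately wraps. A minimal sketch under assumed values (the constants and the `TimeoutDemo.open` helper are illustrative, not Hive's configuration names):

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class TimeoutDemo {
    // Hypothetical defaults mirroring the two configs discussed above: a
    // short connection timeout fails fast on unreachable hosts, while a much
    // longer socket timeout bounds how long a single blocking call may take.
    static final int CONNECT_TIMEOUT_MS = 2_000;
    static final int SOCKET_TIMEOUT_MS = 600_000;

    static Socket open(String host, int port) throws IOException {
        Socket s = new Socket();
        // Applies only to establishing the TCP connection.
        s.connect(new InetSocketAddress(host, port), CONNECT_TIMEOUT_MS);
        // Applies to every subsequent blocking read on the connection.
        s.setSoTimeout(SOCKET_TIMEOUT_MS);
        return s;
    }
}
```

With a single shared timeout, choosing a value large enough for slow metastore calls forces clients to also wait that long before giving up on a dead host — which is the inconvenience the patch addresses.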
[jira] [Work logged] (HIVE-26997) Iceberg: Vectorization gets disabled at runtime in merge-into statements
[ https://issues.apache.org/jira/browse/HIVE-26997?focusedWorklogId=853569&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853569 ] ASF GitHub Bot logged work on HIVE-26997: - Author: ASF GitHub Bot Created on: 29/Mar/23 06:28 Start Date: 29/Mar/23 06:28 Worklog Time Spent: 10m Work Description: kasakrisz commented on code in PR #4162: URL: https://github.com/apache/hive/pull/4162#discussion_r1151445817

## iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/IcebergAcidUtil.java:
@@ -93,10 +95,16 @@ public static Schema createFileReadSchemaWithVirtualColums(List<Types.NestedField> dataCols) {
-  List<Types.NestedField> cols = Lists.newArrayListWithCapacity(dataCols.size() + SERDE_META_COLS.size());
+  public static Schema createSerdeSchemaForDelete(List<Types.NestedField> dataCols, boolean partitioned,
+      Properties serDeProperties) {
+    boolean skipRowData = Boolean.parseBoolean(serDeProperties.getProperty(WriterBuilder.ICEBERG_DELETE_SKIPROWDATA,
+        WriterBuilder.ICEBERG_DELETE_SKIPROWDATA_DEFAULT));
+    List<Types.NestedField> cols = Lists.newArrayListWithCapacity(
+        SERDE_META_COLS.size() + (skipRowData || partitioned ? 0 : dataCols.size()));

Review Comment: is it `skipRowData && !partitioned` ?

## iceberg/iceberg-handler/src/test/queries/positive/vectorized_iceberg_merge_mixed.q:
@@ -0,0 +1,197 @@

Issue Time Tracking --- Worklog Id: (was: 853569) Time Spent: 1h 20m (was: 1h 10m)

> Iceberg: Vectorization gets disabled at runtime in merge-into statements
> ---
> Key: HIVE-26997
> URL: https://issues.apache.org/jira/browse/HIVE-26997
> Project: Hive
> Issue Type: Improvement
> Components: Iceberg integration
> Reporter: Rajesh Balamohan
> Assignee: Zsolt Miskolczi
> Priority: Major
> Labels: pull-request-available
> Attachments: explain_merge_into.txt
> Time Spent: 1h 20m
> Remaining Estimate: 0h
>
> *Query:*
> Think of the "ssv" table as a table containing trickle-feed data in the following query. "store_sales_delete_1" is the destination table.
> > {noformat} > MERGE INTO tpcds_1000_iceberg_mor_v4.store_sales_delete_1 t USING > tpcds_1000_update.ssv s ON (t.ss_item_sk = s.ss_item_sk > > AND t.ss_customer_sk=s.ss_customer_sk > > AND t.ss_sold_date_sk = "2451181" > > AND ((Floor((s.ss_item_sk) / 1000) * 1000) BETWEEN 1000 AND > 2000) > > AND s.ss_ext_discount_amt < 0.0) WHEN matched > AND t.ss_ext_discount_amt IS NULL THEN > UPDATE > SET ss_ext_discount_amt = 0.0 WHEN NOT matched THEN > INSERT (ss_sold_time_sk, > ss_item_sk, > ss_customer_sk, > ss_cdemo_sk, > ss_hdemo_sk, > ss_addr_sk, > ss_store_sk, > ss_promo_sk, > ss_ticket_number, > ss_quantity, > ss_wholesale_cost, > ss_list_price, > ss_sales_price, > ss_ext_discount_amt, > ss_ext_sales_price, > ss_ext_wholesale_cost, > ss_ext_list_price, > ss_ext_tax, > ss_coupon_amt, > ss_net_paid, > ss_net_paid_inc_tax, > ss_net_profit, > ss_sold_date_sk) > VALUES (s.ss_sold_time_sk, > s.ss_item_sk, > s.ss_customer_sk, > s.ss_cdemo_sk, > s.ss_hdemo_sk, > s.ss_addr_sk, > s.ss_store_sk, > s.ss_promo_sk, > s.ss_ticket_number, > s.ss_quantity, > s.ss_wholesale_cost, > s.ss_list_price, > s.ss_sales_price, > s.ss_ext_discount_amt, > s.ss_ext_sales_price, > s.ss_ext_wholesale_cost, > s.ss_ext_list_price, > s.ss_ext_tax, > s.ss_coupon_amt, > s.ss_net_paid, > s.ss_net_paid_inc_tax, > s.ss_net_profit, > "2451181") > {noformat} > > > *Issue:* > # Map phase is not getting vectorized due to "PARTITION_{_}SPEC{_}_ID" column > {noformat} > Map notVectorizedReason: Select expression for SELECT operator: Virtual > column PARTITION__SPEC__ID is not supported {noformat} > > 2. "Reducer 2" stage isn't vectorized. > {noformat} > Reduce notVectorizedReason: exception: java.lang.RuntimeException: Full Outer > Small Table Key Mapping duplicate column 0 in ordered column map {0=(value > column: 30, type info: int), 1=(value column: 31, type info: int)} when > adding value column 53, type
[jira] [Work logged] (HIVE-26968) SharedWorkOptimizer merges TableScan operators that have different DPP parents
[ https://issues.apache.org/jira/browse/HIVE-26968?focusedWorklogId=853567=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853567 ] ASF GitHub Bot logged work on HIVE-26968: - Author: ASF GitHub Bot Created on: 29/Mar/23 06:24 Start Date: 29/Mar/23 06:24 Worklog Time Spent: 10m Work Description: ngsg commented on PR #3981: URL: https://github.com/apache/hive/pull/3981#issuecomment-1488012344 Hello @zabetak. I have added a new qfile, which validates my PR. In a nutshell, this qfile submits the same query twice while varying the value of hive.optimize.shared.work.dppunion. I checked that current Hive produces different results as I described in the JIRA issue (https://issues.apache.org/jira/browse/HIVE-26968). Could you please review the changes? Thank you. Issue Time Tracking --- Worklog Id: (was: 853567) Time Spent: 40m (was: 0.5h) > SharedWorkOptimizer merges TableScan operators that have different DPP parents > -- > > Key: HIVE-26968 > URL: https://issues.apache.org/jira/browse/HIVE-26968 > Project: Hive > Issue Type: Sub-task >Affects Versions: 4.0.0-alpha-2 >Reporter: Seonggon Namgung >Assignee: Seonggon Namgung >Priority: Critical > Labels: hive-4.0.0-must, pull-request-available > Attachments: TPC-DS Query64 OperatorGraph.pdf > > Time Spent: 40m > Remaining Estimate: 0h > > SharedWorkOptimizer merges TableScan operators that have different DPP > parents, which leads to the creation of semantically wrong query plan. > In our environment, running TPC-DS query64 on 1TB Iceberg format table > returns no rows because of this problem. (The correct result has 7094 rows.) > We use hive.optimize.shared.work=true, > hive.optimize.shared.work.extended=true, and > hive.optimize.shared.work.dppunion=false to reproduce the bug. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (HIVE-27176) EXPLAIN SKEW
[ https://issues.apache.org/jira/browse/HIVE-27176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17706227#comment-17706227 ] Yuming Wang commented on HIVE-27176: +1. Our internal Spark also supports similar feature: https://issues.apache.org/jira/browse/SPARK-35837 > EXPLAIN SKEW > > > Key: HIVE-27176 > URL: https://issues.apache.org/jira/browse/HIVE-27176 > Project: Hive > Issue Type: Improvement >Reporter: László Bodor >Priority: Major > > Thinking about a new explain feature, which is actually not an explain, > instead a set of analytical queries: considering a very complicated and large > SQL statement (this below is a simple one, just for example's sake): > {code} > SELECT a FROM (SELECT b ... JOIN c on b.x = c.y) d JOIN e ON d.v = e.w > {code} > EXPLAIN SKEW under the hood should run a query like: > {code} > SELECT "b", "x", x, count (distinct b.x) as count order by count desc limit 50 > UNION ALL > SELECT "c", "y", y, count (distinct c.y) as count order by count desc limit 50 > UNION ALL > SELECT "d", "v", v count (distinct d.v) as count order by count desc limit 50 > UNION ALL > SELECT "e", "w", w, count (distinct e.w) as count order by count desc limit 50 > {code} > collecting some cardinality info about all the join columns found in the > query, so result might be like: > {code} > table_name column_name column_value count > b "x" x_skew_value1 100431234 > b "x" x_skew_value2 234 > c "y" y_skew_value1 35 > c "y" x_skew_value2 45 > c "y" x_skew_value3 42 > ... > {code} > this doesn't solve the problem, instead shows data skew immediately for > further analysis, also it doesn't suffer from incomplete stats problem, as it > really has to query data on the cluster > +1 thing to check: reducer key is not always a join column, e.g. in case of > PTF > maybe we should make a plan, and simply iterate on all reduce sink keys > instead of join columns > -- This message was sent by Atlassian Jira (v8.20.10#820010)