[jira] [Work logged] (HIVE-26809) Upgrade ORC to 1.8.3
[ https://issues.apache.org/jira/browse/HIVE-26809?focusedWorklogId=852129&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-852129 ]

ASF GitHub Bot logged work on HIVE-26809:
-
Author: ASF GitHub Bot
Created on: 22/Mar/23 01:52
Start Date: 22/Mar/23 01:52
Worklog Time Spent: 10m

Work Description: sonarcloud[bot] commented on PR #4138:
URL: https://github.com/apache/hive/pull/4138#issuecomment-1478811647

Kudos, SonarCloud Quality Gate passed! (https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=4138)
0 Bugs, 0 Vulnerabilities, 0 Security Hotspots, 8 Code Smells. No Coverage information. No Duplication information.

Issue Time Tracking
---
Worklog Id: (was: 852129)
Time Spent: 7h 50m (was: 7h 40m)

> Upgrade ORC to 1.8.3
>
>
> Key: HIVE-26809
> URL: https://issues.apache.org/jira/browse/HIVE-26809
> Project: Hive
> Issue Type: Improvement
> Affects Versions: 4.0.0
> Reporter: Dmitriy Fingerman
> Assignee: Zoltán Rátkai
> Priority: Major
> Labels: pull-request-available
> Time Spent: 7h 50m
> Remaining Estimate: 0h

-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-26809) Upgrade ORC to 1.8.3
[ https://issues.apache.org/jira/browse/HIVE-26809?focusedWorklogId=852118&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-852118 ]

ASF GitHub Bot logged work on HIVE-26809:
-
Author: ASF GitHub Bot
Created on: 21/Mar/23 21:53
Start Date: 21/Mar/23 21:53
Worklog Time Spent: 10m

Work Description: sonarcloud[bot] commented on PR #4138:
URL: https://github.com/apache/hive/pull/4138#issuecomment-1478634983

Kudos, SonarCloud Quality Gate passed! (https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=4138)
0 Bugs, 0 Vulnerabilities, 0 Security Hotspots, 12 Code Smells. No Coverage information. No Duplication information.

Issue Time Tracking
---
Worklog Id: (was: 852118)
Time Spent: 7h 40m (was: 7.5h)

> Upgrade ORC to 1.8.3
>
>
> Key: HIVE-26809
> URL: https://issues.apache.org/jira/browse/HIVE-26809
> Project: Hive
> Issue Type: Improvement
> Affects Versions: 4.0.0
> Reporter: Dmitriy Fingerman
> Assignee: Zoltán Rátkai
> Priority: Major
> Labels: pull-request-available
> Time Spent: 7h 40m
> Remaining Estimate: 0h

-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-26913) Iceberg: HiveVectorizedReader::parquetRecordReader should reuse footer information
[ https://issues.apache.org/jira/browse/HIVE-26913?focusedWorklogId=852111&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-852111 ]

ASF GitHub Bot logged work on HIVE-26913:
-
Author: ASF GitHub Bot
Created on: 21/Mar/23 21:05
Start Date: 21/Mar/23 21:05
Worklog Time Spent: 10m

Work Description: sonarcloud[bot] commented on PR #4136:
URL: https://github.com/apache/hive/pull/4136#issuecomment-1478578954

Kudos, SonarCloud Quality Gate passed! (https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=4136)
0 Bugs, 0 Vulnerabilities, 0 Security Hotspots, 0 Code Smells. No Coverage information. No Duplication information.

Issue Time Tracking
---
Worklog Id: (was: 852111)
Time Spent: 1h (was: 50m)

> Iceberg: HiveVectorizedReader::parquetRecordReader should reuse footer
> information
> --
>
> Key: HIVE-26913
> URL: https://issues.apache.org/jira/browse/HIVE-26913
> Project: Hive
> Issue Type: Improvement
> Components: Iceberg integration
> Reporter: Rajesh Balamohan
> Assignee: Ayush Saxena
> Priority: Major
> Labels: performance, pull-request-available, stability
> Fix For: 4.0.0
>
> Attachments: Screenshot 2023-01-09 at 4.01.14 PM.png
>
> Time Spent: 1h
> Remaining Estimate: 0h
>
> HiveVectorizedReader::parquetRecordReader should reuse details of parquet
> footer, instead of reading it again.
>
> It reads parquet footer here:
> [https://github.com/apache/hive/blob/master/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/vector/HiveVectorizedReader.java#L230-L232]
> Again it reads the footer here for constructing vectorized recordreader
> [https:
[jira] [Assigned] (HIVE-27163) Column stats are not getting published after an insert query into an external table with custom location
[ https://issues.apache.org/jira/browse/HIVE-27163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Naveen Gangam reassigned HIVE-27163:

Assignee: Zhihua Deng

> Column stats are not getting published after an insert query into an external
> table with custom location
>
>
> Key: HIVE-27163
> URL: https://issues.apache.org/jira/browse/HIVE-27163
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Reporter: Taraka Rama Rao Lethavadla
> Assignee: Zhihua Deng
> Priority: Major
>
> Test case details are below
> *test.q*
> {noformat}
> set hive.stats.column.autogather=true;
> set hive.stats.autogather=true;
> dfs ${system:test.dfs.mkdir} ${system:test.tmp.dir}/test;
> create external table test_custom(age int, name string) stored as orc location '/tmp/test';
> insert into test_custom select 1, 'test';
> desc formatted test_custom age;{noformat}
> *test.q.out*
> {noformat}
> A masked pattern was here
> PREHOOK: type: CREATETABLE
> A masked pattern was here
> PREHOOK: Output: database:default
> PREHOOK: Output: default@test_custom
> A masked pattern was here
> POSTHOOK: type: CREATETABLE
> A masked pattern was here
> POSTHOOK: Output: database:default
> POSTHOOK: Output: default@test_custom
> PREHOOK: query: insert into test_custom select 1, 'test'
> PREHOOK: type: QUERY
> PREHOOK: Input: _dummy_database@_dummy_table
> PREHOOK: Output: default@test_custom
> POSTHOOK: query: insert into test_custom select 1, 'test'
> POSTHOOK: type: QUERY
> POSTHOOK: Input: _dummy_database@_dummy_table
> POSTHOOK: Output: default@test_custom
> POSTHOOK: Lineage: test_custom.age SIMPLE []
> POSTHOOK: Lineage: test_custom.name SIMPLE []
> PREHOOK: query: desc formatted test_custom age
> PREHOOK: type: DESCTABLE
> PREHOOK: Input: default@test_custom
> POSTHOOK: query: desc formatted test_custom age
> POSTHOOK: type: DESCTABLE
> POSTHOOK: Input: default@test_custom
> col_name        age
> data_type       int
> min
> max
> num_nulls
> distinct_count
> avg_col_len
> max_col_len
> num_trues
> num_falses
> bit_vector
> comment         from deserializer{noformat}
> As we can see from desc formatted output, column stats were not populated

-- This message was sent by Atlassian Jira (v8.20.10#820010)
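For quick verification outside a qtest run, the published stats can also be inspected through the metastore client API. A minimal sketch, not part of the issue: it assumes a reachable metastore and the Hive 3-style getTableColumnStatistics(db, table, cols) signature (Hive 4 adds an engine argument). An empty result list is the symptom described above.

{code:java}
import java.util.Arrays;
import java.util.List;

import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;
import org.apache.hadoop.hive.metastore.api.ColumnStatisticsObj;

public class CheckColumnStats {
  public static void main(String[] args) throws Exception {
    HiveConf conf = new HiveConf(); // picks up hive-site.xml from the classpath
    HiveMetaStoreClient client = new HiveMetaStoreClient(conf);
    // Fetch stats for the 'age' column of default.test_custom; an empty
    // list here reproduces the "stats not published" symptom.
    List<ColumnStatisticsObj> stats =
        client.getTableColumnStatistics("default", "test_custom", Arrays.asList("age"));
    System.out.println(stats.isEmpty() ? "no column stats published" : stats);
    client.close();
  }
}
{code}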
[jira] [Work logged] (HIVE-27147) HS2 is not accessible to clients via zookeeper when hostname used is not FQDN
[ https://issues.apache.org/jira/browse/HIVE-27147?focusedWorklogId=852105&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-852105 ]

ASF GitHub Bot logged work on HIVE-27147:
-
Author: ASF GitHub Bot
Created on: 21/Mar/23 20:00
Start Date: 21/Mar/23 20:00
Worklog Time Spent: 10m

Work Description: sonarcloud[bot] commented on PR #4130:
URL: https://github.com/apache/hive/pull/4130#issuecomment-1478505263

Kudos, SonarCloud Quality Gate passed! (https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=4130)
0 Bugs, 0 Vulnerabilities, 0 Security Hotspots, 0 Code Smells. No Coverage information. No Duplication information.

Issue Time Tracking
---
Worklog Id: (was: 852105)
Time Spent: 1h 10m (was: 1h)

> HS2 is not accessible to clients via zookeeper when hostname used is not FQDN
> -
>
> Key: HIVE-27147
> URL: https://issues.apache.org/jira/browse/HIVE-27147
> Project: Hive
> Issue Type: Bug
> Reporter: Venugopal Reddy K
> Priority: Major
> Labels: pull-request-available
> Time Spent: 1h 10m
> Remaining Estimate: 0h
>
> HS2 is not accessible to clients via zookeeper when hostname used during
> registration is InetAddress.getHostName() with JDK 11. This issue is
> happening due to change in behavior on JDK 11 and it is OS specific -
> [https://stackoverflow.com/questions/61898627/inetaddress-getlocalhost-gethostname-different-behavior-between-jdk-11-and-j|http://example.com/]

-- This message was sent by Atlassian Jira (v8.20.10#820010)
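The JDK behavior behind this report is easy to demonstrate in isolation. A minimal sketch (plain Java, not Hive code; the hostnames in the comments are illustrative, and actual output depends on the OS resolver configuration):

{code:java}
import java.net.InetAddress;

public class HostNameDemo {
  public static void main(String[] args) throws Exception {
    InetAddress local = InetAddress.getLocalHost();
    // On JDK 8 this often returned an FQDN (e.g. "host1.example.com");
    // on JDK 11 it may return only the short name (e.g. "host1") on some
    // operating systems, so an HS2 instance registering this value in
    // ZooKeeper may publish a name clients cannot resolve.
    System.out.println("getHostName():          " + local.getHostName());
    // getCanonicalHostName() asks the resolver for the fully qualified
    // name and is one way to obtain an FQDN regardless of JDK version.
    System.out.println("getCanonicalHostName(): " + local.getCanonicalHostName());
  }
}
{code}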
[jira] [Work logged] (HIVE-26809) Upgrade ORC to 1.8.3
[ https://issues.apache.org/jira/browse/HIVE-26809?focusedWorklogId=852102&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-852102 ]

ASF GitHub Bot logged work on HIVE-26809:
-
Author: ASF GitHub Bot
Created on: 21/Mar/23 19:48
Start Date: 21/Mar/23 19:48
Worklog Time Spent: 10m

Work Description: difin opened a new pull request, #4138:
URL: https://github.com/apache/hive/pull/4138

Issue Time Tracking
---
Worklog Id: (was: 852102)
Time Spent: 7.5h (was: 7h 20m)

> Upgrade ORC to 1.8.3
>
>
> Key: HIVE-26809
> URL: https://issues.apache.org/jira/browse/HIVE-26809
> Project: Hive
> Issue Type: Improvement
> Affects Versions: 4.0.0
> Reporter: Dmitriy Fingerman
> Assignee: Zoltán Rátkai
> Priority: Major
> Labels: pull-request-available
> Time Spent: 7.5h
> Remaining Estimate: 0h

-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HIVE-27164) Create Temp Txn Table As Select is failing at tablePath validation
[ https://issues.apache.org/jira/browse/HIVE-27164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Naresh P R updated HIVE-27164:
--
Description:
After HIVE-25303, every CTAS goes for HiveMetaStore$HMSHandler#translate_table_dryrun() call to fetch table location for CTAS queries which fails with following exception for temp tables if MetastoreDefaultTransformer is set.
{code:java}
2023-03-17 16:41:23,390 INFO org.apache.hadoop.hive.metastore.MetastoreDefaultTransformer: [pool-6-thread-196]: Starting translation for CreateTable for processor HMSClient-@localhost with [EXTWRITE, EXTREAD, HIVEBUCKET2, HIVEFULLACIDREAD, HIVEFULLACIDWRITE, HIVECACHEINVALIDATE, HIVEMANAGESTATS, HIVEMANAGEDINSERTWRITE, HIVEMANAGEDINSERTREAD, HIVESQL, HIVEMQT, HIVEONLYMQTWRITE] on table test_temp
2023-03-17 16:41:23,392 ERROR org.apache.hadoop.hive.metastore.RetryingHMSHandler: [pool-6-thread-196]: MetaException(message:Illegal location for managed table, it has to be within database's managed location)
at org.apache.hadoop.hive.metastore.MetastoreDefaultTransformer.validateTablePaths(MetastoreDefaultTransformer.java:886)
at org.apache.hadoop.hive.metastore.MetastoreDefaultTransformer.transformCreateTable(MetastoreDefaultTransformer.java:666)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.translate_table_dryrun(HiveMetaStore.java:2164)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
{code}
I am able to repro this issue using attached testcase. [^mm_cttas.q]
There are multiple ways to fix this issue:
* Have temp txn table path under db's managed location path. This will help with encryption zone paths as well.
* Skip location check for temp tables at MetastoreDefaultTransformer#validateTablePaths()

was:
After HIVE-25303, every CTAS goes for HiveMetaStore$HMSHandler#translate_table_dryrun() call to fetch table location for CTAS queries which fails with following exception for temp tables if MetastoreDefaultTransformer is set.
{code:java}
2023-03-17 16:41:23,390 INFO org.apache.hadoop.hive.metastore.MetastoreDefaultTransformer: [pool-6-thread-196]: Starting translation for CreateTable for processor HMSClient-@localhost with [EXTWRITE, EXTREAD, HIVEBUCKET2, HIVEFULLACIDREAD, HIVEFULLACIDWRITE, HIVECACHEINVALIDATE, HIVEMANAGESTATS, HIVEMANAGEDINSERTWRITE, HIVEMANAGEDINSERTREAD, HIVESQL, HIVEMQT, HIVEONLYMQTWRITE] on table test_temp
2023-03-17 16:41:23,392 ERROR org.apache.hadoop.hive.metastore.RetryingHMSHandler: [pool-6-thread-196]: MetaException(message:Illegal location for managed table, it has to be within database's managed location)
at org.apache.hadoop.hive.metastore.MetastoreDefaultTransformer.validateTablePaths(MetastoreDefaultTransformer.java:886)
at org.apache.hadoop.hive.metastore.MetastoreDefaultTransformer.transformCreateTable(MetastoreDefaultTransformer.java:666)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.translate_table_dryrun(HiveMetaStore.java:2164)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
{code}
I am able to repro this issue at apache upstream using attached testcase. [^mm_cttas.q]
There are multiple ways to fix this issue:
* Have temp txn table path under db's managed location path. This will help with encryption zone paths as well.
* Skip location check for temp tables at MetastoreDefaultTransformer#validateTablePaths()

> Create Temp Txn Table As Select is failing at tablePath validation
> --
>
> Key: HIVE-27164
> URL: https://issues.apache.org/jira/browse/HIVE-27164
> Project: Hive
> Issue Type: Bug
> Components: HiveServer2, Metastore
> Reporter: Naresh P R
> Priority: Major
> Attachments: mm_cttas.q
>
>
> After HIVE-25303, every CTAS goes for HiveMetaStore$HMSHandler#translate_table_dryrun() call to fetch table location for CTAS queries which fails with following exception for temp tables if MetastoreDefaultTransformer is set.
> {code:java}
> 2023-03-17 16:41:23,390 INFO org.apache.hadoop.hive.metastore.MetastoreDefaultTransformer: [pool-6-thread-196]: Starting translation for CreateTable for processor HMSClient-@localhost with [EXTWRITE, EXTREAD, HIVEBUCKET2, HIVEFULLACIDREAD, HIVEFULLACIDWRITE, HIVECACHEINVALIDATE, HIVEMANAGESTATS, HIVEMANAGEDINSERTWRITE, HIVEMANAGEDINSERTREAD, HIVESQL, HIVEMQT, HIVEONLYMQTWRITE] on table test_temp
> 2023-03-17 16:41:23,392 ERROR org.apache.hadoop.hive.metastore.RetryingHMSHandler: [pool-6-thread-196]: MetaException(message:Illegal location for managed table, it has to be within database's managed location)
> at org.apache.hadoop.hive.metastore.MetastoreDefaultTransformer.validateTablePaths(MetastoreDefaultTransformer.java:886)
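The second fix option can be sketched in code. This is a hypothetical guard, not the actual patch: it assumes the thrift Table object's temporary flag is populated by the time the transformer runs.

{code:java}
import org.apache.hadoop.hive.metastore.api.MetaException;
import org.apache.hadoop.hive.metastore.api.Table;

public class TablePathValidationSketch {
  // Hypothetical helper mirroring MetastoreDefaultTransformer#validateTablePaths:
  // temp table locations live under the session scratch directory, not the
  // database's managed location, so the managed-location check is skipped.
  static void validateTablePaths(Table table, String dbManagedLocation) throws MetaException {
    if (table.isTemporary()) {
      return; // temp table: session-scoped location, nothing to validate
    }
    String location = table.getSd().getLocation();
    if (location != null && !location.startsWith(dbManagedLocation)) {
      throw new MetaException(
          "Illegal location for managed table, it has to be within database's managed location");
    }
  }
}
{code}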
[jira] [Work logged] (HIVE-27113) Increasing default for hive.thrift.client.max.message.size to 2 GB
[ https://issues.apache.org/jira/browse/HIVE-27113?focusedWorklogId=852088&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-852088 ]

ASF GitHub Bot logged work on HIVE-27113:
-
Author: ASF GitHub Bot
Created on: 21/Mar/23 18:53
Start Date: 21/Mar/23 18:53
Worklog Time Spent: 10m

Work Description: sonarcloud[bot] commented on PR #4137:
URL: https://github.com/apache/hive/pull/4137#issuecomment-1478424686

Kudos, SonarCloud Quality Gate passed! (https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=4137)
0 Bugs, 0 Vulnerabilities, 0 Security Hotspots, 0 Code Smells. No Coverage information. No Duplication information.

Issue Time Tracking
---
Worklog Id: (was: 852088)
Time Spent: 40m (was: 0.5h)

> Increasing default for hive.thrift.client.max.message.size to 2 GB
> --
>
> Key: HIVE-27113
> URL: https://issues.apache.org/jira/browse/HIVE-27113
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Reporter: Riju Trivedi
> Assignee: Riju Trivedi
> Priority: Major
> Labels: pull-request-available
> Time Spent: 40m
> Remaining Estimate: 0h
>
> HIVE_THRIFT_CLIENT_MAX_MESSAGE_SIZE("hive.thrift.client.max.message.size", "1gb",
>     new SizeValidator(-1L, true, (long) Integer.MAX_VALUE, true),
>     "Thrift client configuration for max message size. 0 or -1 will use the default defined in the Thrift " +
>     "library. The upper limit is 2147483648 bytes (or 2gb).")
> Documentation on the help suggests setting 2147483648 while Integer Max is
> 2147483647. So, it actually becomes -1 and gets set to thrift default limit
> (100 MB)

-- This message was sent by Atlassian Jira (v8.20.10
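The root cause quoted above is a narrowing overflow, which a few lines of standalone Java (not Hive code) make concrete:

{code:java}
public class MessageSizeOverflow {
  public static void main(String[] args) {
    long documented = 2147483648L; // 2 GB, the value the help text suggests
    long max = Integer.MAX_VALUE;  // 2147483647, the actual upper limit
    // Narrowing 2 GB to int wraps to a negative number, so code that casts
    // the configured size to int ends up with a value that falls back to
    // the thrift library default (100 MB), as described in the issue.
    System.out.println((int) documented); // -2147483648
    System.out.println((int) max);        // 2147483647
  }
}
{code}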
[jira] [Work logged] (HIVE-27157) AssertionError when inferring return type for unix_timestamp function
[ https://issues.apache.org/jira/browse/HIVE-27157?focusedWorklogId=852082&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-852082 ]

ASF GitHub Bot logged work on HIVE-27157:
-
Author: ASF GitHub Bot
Created on: 21/Mar/23 18:12
Start Date: 21/Mar/23 18:12
Worklog Time Spent: 10m

Work Description: sonarcloud[bot] commented on PR #4135:
URL: https://github.com/apache/hive/pull/4135#issuecomment-1478373963

Kudos, SonarCloud Quality Gate passed! (https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=4135)
0 Bugs, 0 Vulnerabilities, 0 Security Hotspots, 2 Code Smells. No Coverage information. No Duplication information.

Issue Time Tracking
---
Worklog Id: (was: 852082)
Time Spent: 20m (was: 10m)

> AssertionError when inferring return type for unix_timestamp function
> -
>
> Key: HIVE-27157
> URL: https://issues.apache.org/jira/browse/HIVE-27157
> Project: Hive
> Issue Type: Bug
> Components: CBO
> Affects Versions: 4.0.0-alpha-2
> Reporter: Stamatis Zampetakis
> Assignee: Stamatis Zampetakis
> Priority: Major
> Labels: pull-request-available
> Time Spent: 20m
> Remaining Estimate: 0h
>
> Any attempt to derive the return data type for the {{unix_timestamp}}
> function results into the following assertion error.
> {noformat}
> java.lang.AssertionError: typeName.allowsPrecScale(true, false): BIGINT
> at org.apache.calcite.sql.type.BasicSqlType.checkPrecScale(BasicSqlType.java:65)
> at org.apache.calcite.sql.type.BasicSqlType.<init>(BasicSqlType.java:81)
> at org.apache.calcite.sql.type.SqlTypeFactoryImpl.createSqlType(SqlTypeFactoryImpl.java:67)
> at org.apache.calc
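The assertion can be reproduced directly against Calcite's type factory. A minimal sketch (assumes Calcite on the classpath and JVM assertions enabled with -ea; BIGINT is not a precision-bearing type, so passing a precision trips the same check the unix_timestamp return-type inference hits):

{code:java}
import org.apache.calcite.rel.type.RelDataType;
import org.apache.calcite.rel.type.RelDataTypeSystem;
import org.apache.calcite.sql.type.SqlTypeFactoryImpl;
import org.apache.calcite.sql.type.SqlTypeName;

public class BigintPrecisionDemo {
  public static void main(String[] args) {
    SqlTypeFactoryImpl typeFactory = new SqlTypeFactoryImpl(RelDataTypeSystem.DEFAULT);

    // Fine: BIGINT without precision or scale.
    RelDataType ok = typeFactory.createSqlType(SqlTypeName.BIGINT);
    System.out.println(ok);

    // Fails with "typeName.allowsPrecScale(true, false): BIGINT" when
    // assertions are enabled, because BIGINT does not accept a precision.
    RelDataType bad = typeFactory.createSqlType(SqlTypeName.BIGINT, 10);
    System.out.println(bad);
  }
}
{code}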
[jira] [Updated] (HIVE-27164) Create Temp Txn Table As Select is failing at tablePath validation
[ https://issues.apache.org/jira/browse/HIVE-27164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Naresh P R updated HIVE-27164:
--
Description:
After HIVE-25303, every CTAS goes for HiveMetaStore$HMSHandler#translate_table_dryrun() call to fetch table location for CTAS queries which fails with following exception for temp tables if MetastoreDefaultTransformer is set.
{code:java}
2023-03-17 16:41:23,390 INFO org.apache.hadoop.hive.metastore.MetastoreDefaultTransformer: [pool-6-thread-196]: Starting translation for CreateTable for processor HMSClient-@localhost with [EXTWRITE, EXTREAD, HIVEBUCKET2, HIVEFULLACIDREAD, HIVEFULLACIDWRITE, HIVECACHEINVALIDATE, HIVEMANAGESTATS, HIVEMANAGEDINSERTWRITE, HIVEMANAGEDINSERTREAD, HIVESQL, HIVEMQT, HIVEONLYMQTWRITE] on table test_temp
2023-03-17 16:41:23,392 ERROR org.apache.hadoop.hive.metastore.RetryingHMSHandler: [pool-6-thread-196]: MetaException(message:Illegal location for managed table, it has to be within database's managed location)
at org.apache.hadoop.hive.metastore.MetastoreDefaultTransformer.validateTablePaths(MetastoreDefaultTransformer.java:886)
at org.apache.hadoop.hive.metastore.MetastoreDefaultTransformer.transformCreateTable(MetastoreDefaultTransformer.java:666)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.translate_table_dryrun(HiveMetaStore.java:2164)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
{code}
I am able to repro this issue at apache upstream using attached testcase. [^mm_cttas.q]
There are multiple ways to fix this issue:
* Have temp txn table path under db's managed location path. This will help with encryption zone paths as well.
* Skip location check for temp tables at MetastoreDefaultTransformer#validateTablePaths()

was:
After HIVE-25303, every CTAS goes for HiveMetaStore$HMSHandler#translate_table_dryrun() call to fetch table location for CTAS queries which fails with following exception for temp tables if MetastoreDefaultTransformer is set.
{code:java}
2023-03-17 16:41:23,390 INFO org.apache.hadoop.hive.metastore.MetastoreDefaultTransformer: [pool-6-thread-196]: Starting translation for CreateTable for processor HMSClient-@localhost with [EXTWRITE, EXTREAD, HIVEBUCKET2, HIVEFULLACIDREAD, HIVEFULLACIDWRITE, HIVECACHEINVALIDATE, HIVEMANAGESTATS, HIVEMANAGEDINSERTWRITE, HIVEMANAGEDINSERTREAD, HIVESQL, HIVEMQT, HIVEONLYMQTWRITE] on table test_temp
2023-03-17 16:41:23,392 ERROR org.apache.hadoop.hive.metastore.RetryingHMSHandler: [pool-6-thread-196]: MetaException(message:Illegal location for managed table, it has to be within database's managed location)
at org.apache.hadoop.hive.metastore.MetastoreDefaultTransformer.validateTablePaths(MetastoreDefaultTransformer.java:886)
at org.apache.hadoop.hive.metastore.MetastoreDefaultTransformer.transformCreateTable(MetastoreDefaultTransformer.java:666)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.translate_table_dryrun(HiveMetaStore.java:2164)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
{code}
I am able to repro this issue at apache upstream using attached testcase. [^mm_cttas.q]
There are multiple ways to fix this issue:
* Have temp txn table path under db's managed location path. This will help with encryption zone tables as well.
* Skip location check for temp tables at MetastoreDefaultTransformer#validateTablePaths()

> Create Temp Txn Table As Select is failing at tablePath validation
> --
>
> Key: HIVE-27164
> URL: https://issues.apache.org/jira/browse/HIVE-27164
> Project: Hive
> Issue Type: Bug
> Components: HiveServer2, Metastore
> Reporter: Naresh P R
> Priority: Major
> Attachments: mm_cttas.q
>
>
> After HIVE-25303, every CTAS goes for HiveMetaStore$HMSHandler#translate_table_dryrun() call to fetch table location for CTAS queries which fails with following exception for temp tables if MetastoreDefaultTransformer is set.
> {code:java}
> 2023-03-17 16:41:23,390 INFO org.apache.hadoop.hive.metastore.MetastoreDefaultTransformer: [pool-6-thread-196]: Starting translation for CreateTable for processor HMSClient-@localhost with [EXTWRITE, EXTREAD, HIVEBUCKET2, HIVEFULLACIDREAD, HIVEFULLACIDWRITE, HIVECACHEINVALIDATE, HIVEMANAGESTATS, HIVEMANAGEDINSERTWRITE, HIVEMANAGEDINSERTREAD, HIVESQL, HIVEMQT, HIVEONLYMQTWRITE] on table test_temp
> 2023-03-17 16:41:23,392 ERROR org.apache.hadoop.hive.metastore.RetryingHMSHandler: [pool-6-thread-196]: MetaException(message:Illegal location for managed table, it has to be within database's managed location)
> at org.apache.hadoop.hive.metastore.MetastoreDefaultTransformer.validateTablePaths(MetastoreDefaultTra
[jira] [Updated] (HIVE-27164) Create Temp Txn Table As Select is failing at tablePath validation
[ https://issues.apache.org/jira/browse/HIVE-27164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Naresh P R updated HIVE-27164:
--
Attachment: mm_cttas.q

> Create Temp Txn Table As Select is failing at tablePath validation
> --
>
> Key: HIVE-27164
> URL: https://issues.apache.org/jira/browse/HIVE-27164
> Project: Hive
> Issue Type: Bug
> Components: HiveServer2, Metastore
> Reporter: Naresh P R
> Priority: Major
> Attachments: mm_cttas.q
>
>
> After HIVE-25303, every CTAS goes for HiveMetaStore$HMSHandler#translate_table_dryrun() call to fetch table location for CTAS queries which fails with following exception for temp tables if MetastoreDefaultTransformer is set.
> {code:java}
> 2023-03-17 16:41:23,390 INFO org.apache.hadoop.hive.metastore.MetastoreDefaultTransformer: [pool-6-thread-196]: Starting translation for CreateTable for processor HMSClient-@localhost with [EXTWRITE, EXTREAD, HIVEBUCKET2, HIVEFULLACIDREAD, HIVEFULLACIDWRITE, HIVECACHEINVALIDATE, HIVEMANAGESTATS, HIVEMANAGEDINSERTWRITE, HIVEMANAGEDINSERTREAD, HIVESQL, HIVEMQT, HIVEONLYMQTWRITE] on table test_temp
> 2023-03-17 16:41:23,392 ERROR org.apache.hadoop.hive.metastore.RetryingHMSHandler: [pool-6-thread-196]: MetaException(message:Illegal location for managed table, it has to be within database's managed location)
> at org.apache.hadoop.hive.metastore.MetastoreDefaultTransformer.validateTablePaths(MetastoreDefaultTransformer.java:886)
> at org.apache.hadoop.hive.metastore.MetastoreDefaultTransformer.transformCreateTable(MetastoreDefaultTransformer.java:666)
> at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.translate_table_dryrun(HiveMetaStore.java:2164)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) {code}
> I am able to repro this issue at apache upstream using attached testcase.
>
> There are multiple ways to fix this issue
> * Have temp txn table path under db's managed location path. This will help with encryption zone tables as well.
> * skips location check for temp tables at MetastoreDefaultTransformer#validateTablePaths()

-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HIVE-27164) Create Temp Txn Table As Select is failing at tablePath validation
[ https://issues.apache.org/jira/browse/HIVE-27164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Naresh P R updated HIVE-27164:
--
Description:
After HIVE-25303, every CTAS goes for HiveMetaStore$HMSHandler#translate_table_dryrun() call to fetch table location for CTAS queries which fails with following exception for temp tables if MetastoreDefaultTransformer is set.
{code:java}
2023-03-17 16:41:23,390 INFO org.apache.hadoop.hive.metastore.MetastoreDefaultTransformer: [pool-6-thread-196]: Starting translation for CreateTable for processor HMSClient-@localhost with [EXTWRITE, EXTREAD, HIVEBUCKET2, HIVEFULLACIDREAD, HIVEFULLACIDWRITE, HIVECACHEINVALIDATE, HIVEMANAGESTATS, HIVEMANAGEDINSERTWRITE, HIVEMANAGEDINSERTREAD, HIVESQL, HIVEMQT, HIVEONLYMQTWRITE] on table test_temp
2023-03-17 16:41:23,392 ERROR org.apache.hadoop.hive.metastore.RetryingHMSHandler: [pool-6-thread-196]: MetaException(message:Illegal location for managed table, it has to be within database's managed location)
at org.apache.hadoop.hive.metastore.MetastoreDefaultTransformer.validateTablePaths(MetastoreDefaultTransformer.java:886)
at org.apache.hadoop.hive.metastore.MetastoreDefaultTransformer.transformCreateTable(MetastoreDefaultTransformer.java:666)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.translate_table_dryrun(HiveMetaStore.java:2164)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
{code}
I am able to repro this issue at apache upstream using attached testcase. [^mm_cttas.q]
There are multiple ways to fix this issue:
* Have temp txn table path under db's managed location path. This will help with encryption zone tables as well.
* skips location check for temp tables at MetastoreDefaultTransformer#validateTablePaths()

was:
After HIVE-25303, every CTAS goes for HiveMetaStore$HMSHandler#translate_table_dryrun() call to fetch table location for CTAS queries which fails with following exception for temp tables if MetastoreDefaultTransformer is set.
{code:java}
2023-03-17 16:41:23,390 INFO org.apache.hadoop.hive.metastore.MetastoreDefaultTransformer: [pool-6-thread-196]: Starting translation for CreateTable for processor HMSClient-@localhost with [EXTWRITE, EXTREAD, HIVEBUCKET2, HIVEFULLACIDREAD, HIVEFULLACIDWRITE, HIVECACHEINVALIDATE, HIVEMANAGESTATS, HIVEMANAGEDINSERTWRITE, HIVEMANAGEDINSERTREAD, HIVESQL, HIVEMQT, HIVEONLYMQTWRITE] on table test_temp
2023-03-17 16:41:23,392 ERROR org.apache.hadoop.hive.metastore.RetryingHMSHandler: [pool-6-thread-196]: MetaException(message:Illegal location for managed table, it has to be within database's managed location)
at org.apache.hadoop.hive.metastore.MetastoreDefaultTransformer.validateTablePaths(MetastoreDefaultTransformer.java:886)
at org.apache.hadoop.hive.metastore.MetastoreDefaultTransformer.transformCreateTable(MetastoreDefaultTransformer.java:666)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.translate_table_dryrun(HiveMetaStore.java:2164)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
{code}
I am able to repro this issue at apache upstream using attached testcase.
There are multiple ways to fix this issue:
* Have temp txn table path under db's managed location path. This will help with encryption zone tables as well.
* skips location check for temp tables at MetastoreDefaultTransformer#validateTablePaths()

> Create Temp Txn Table As Select is failing at tablePath validation
> --
>
> Key: HIVE-27164
> URL: https://issues.apache.org/jira/browse/HIVE-27164
> Project: Hive
> Issue Type: Bug
> Components: HiveServer2, Metastore
> Reporter: Naresh P R
> Priority: Major
> Attachments: mm_cttas.q
>
>
> After HIVE-25303, every CTAS goes for HiveMetaStore$HMSHandler#translate_table_dryrun() call to fetch table location for CTAS queries which fails with following exception for temp tables if MetastoreDefaultTransformer is set.
> {code:java}
> 2023-03-17 16:41:23,390 INFO org.apache.hadoop.hive.metastore.MetastoreDefaultTransformer: [pool-6-thread-196]: Starting translation for CreateTable for processor HMSClient-@localhost with [EXTWRITE, EXTREAD, HIVEBUCKET2, HIVEFULLACIDREAD, HIVEFULLACIDWRITE, HIVECACHEINVALIDATE, HIVEMANAGESTATS, HIVEMANAGEDINSERTWRITE, HIVEMANAGEDINSERTREAD, HIVESQL, HIVEMQT, HIVEONLYMQTWRITE] on table test_temp
> 2023-03-17 16:41:23,392 ERROR org.apache.hadoop.hive.metastore.RetryingHMSHandler: [pool-6-thread-196]: MetaException(message:Illegal location for managed table, it has to be within database's managed location)
> at org.apache.hadoop.hive.metastore.MetastoreDefaultTransformer.validateTablePaths(MetastoreDefaultTransformer.jav
[jira] [Updated] (HIVE-27164) Create Temp Txn Table As Select is failing at tablePath validation
[ https://issues.apache.org/jira/browse/HIVE-27164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Naresh P R updated HIVE-27164:
--
Description:
After HIVE-25303, every CTAS goes for HiveMetaStore$HMSHandler#translate_table_dryrun() call to fetch table location for CTAS queries which fails with following exception for temp tables if MetastoreDefaultTransformer is set.
{code:java}
2023-03-17 16:41:23,390 INFO org.apache.hadoop.hive.metastore.MetastoreDefaultTransformer: [pool-6-thread-196]: Starting translation for CreateTable for processor HMSClient-@localhost with [EXTWRITE, EXTREAD, HIVEBUCKET2, HIVEFULLACIDREAD, HIVEFULLACIDWRITE, HIVECACHEINVALIDATE, HIVEMANAGESTATS, HIVEMANAGEDINSERTWRITE, HIVEMANAGEDINSERTREAD, HIVESQL, HIVEMQT, HIVEONLYMQTWRITE] on table test_temp
2023-03-17 16:41:23,392 ERROR org.apache.hadoop.hive.metastore.RetryingHMSHandler: [pool-6-thread-196]: MetaException(message:Illegal location for managed table, it has to be within database's managed location)
at org.apache.hadoop.hive.metastore.MetastoreDefaultTransformer.validateTablePaths(MetastoreDefaultTransformer.java:886)
at org.apache.hadoop.hive.metastore.MetastoreDefaultTransformer.transformCreateTable(MetastoreDefaultTransformer.java:666)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.translate_table_dryrun(HiveMetaStore.java:2164)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
{code}
I am able to repro this issue at apache upstream using attached testcase. [^mm_cttas.q]
There are multiple ways to fix this issue:
* Have temp txn table path under db's managed location path. This will help with encryption zone tables as well.
* Skip location check for temp tables at MetastoreDefaultTransformer#validateTablePaths()

was:
After HIVE-25303, every CTAS goes for HiveMetaStore$HMSHandler#translate_table_dryrun() call to fetch table location for CTAS queries which fails with following exception for temp tables if MetastoreDefaultTransformer is set.
{code:java}
2023-03-17 16:41:23,390 INFO org.apache.hadoop.hive.metastore.MetastoreDefaultTransformer: [pool-6-thread-196]: Starting translation for CreateTable for processor HMSClient-@localhost with [EXTWRITE, EXTREAD, HIVEBUCKET2, HIVEFULLACIDREAD, HIVEFULLACIDWRITE, HIVECACHEINVALIDATE, HIVEMANAGESTATS, HIVEMANAGEDINSERTWRITE, HIVEMANAGEDINSERTREAD, HIVESQL, HIVEMQT, HIVEONLYMQTWRITE] on table test_temp
2023-03-17 16:41:23,392 ERROR org.apache.hadoop.hive.metastore.RetryingHMSHandler: [pool-6-thread-196]: MetaException(message:Illegal location for managed table, it has to be within database's managed location)
at org.apache.hadoop.hive.metastore.MetastoreDefaultTransformer.validateTablePaths(MetastoreDefaultTransformer.java:886)
at org.apache.hadoop.hive.metastore.MetastoreDefaultTransformer.transformCreateTable(MetastoreDefaultTransformer.java:666)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.translate_table_dryrun(HiveMetaStore.java:2164)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
{code}
I am able to repro this issue at apache upstream using attached testcase. [^mm_cttas.q]
There are multiple ways to fix this issue:
* Have temp txn table path under db's managed location path. This will help with encryption zone tables as well.
* skips location check for temp tables at MetastoreDefaultTransformer#validateTablePaths()

> Create Temp Txn Table As Select is failing at tablePath validation
> --
>
> Key: HIVE-27164
> URL: https://issues.apache.org/jira/browse/HIVE-27164
> Project: Hive
> Issue Type: Bug
> Components: HiveServer2, Metastore
> Reporter: Naresh P R
> Priority: Major
> Attachments: mm_cttas.q
>
>
> After HIVE-25303, every CTAS goes for HiveMetaStore$HMSHandler#translate_table_dryrun() call to fetch table location for CTAS queries which fails with following exception for temp tables if MetastoreDefaultTransformer is set.
> {code:java}
> 2023-03-17 16:41:23,390 INFO org.apache.hadoop.hive.metastore.MetastoreDefaultTransformer: [pool-6-thread-196]: Starting translation for CreateTable for processor HMSClient-@localhost with [EXTWRITE, EXTREAD, HIVEBUCKET2, HIVEFULLACIDREAD, HIVEFULLACIDWRITE, HIVECACHEINVALIDATE, HIVEMANAGESTATS, HIVEMANAGEDINSERTWRITE, HIVEMANAGEDINSERTREAD, HIVESQL, HIVEMQT, HIVEONLYMQTWRITE] on table test_temp
> 2023-03-17 16:41:23,392 ERROR org.apache.hadoop.hive.metastore.RetryingHMSHandler: [pool-6-thread-196]: MetaException(message:Illegal location for managed table, it has to be within database's managed location)
> at org.apache.hadoop.hive.metastore.MetastoreDefaultTransformer.validateTablePaths(MetastoreDefaultT
[jira] [Work logged] (HIVE-27113) Increasing default for hive.thrift.client.max.message.size to 2 GB
[ https://issues.apache.org/jira/browse/HIVE-27113?focusedWorklogId=852080&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-852080 ]

ASF GitHub Bot logged work on HIVE-27113:
-
Author: ASF GitHub Bot
Created on: 21/Mar/23 18:06
Start Date: 21/Mar/23 18:06
Worklog Time Spent: 10m

Work Description: jfsii commented on code in PR #4137:
URL: https://github.com/apache/hive/pull/4137#discussion_r1143811416

## service/src/java/org/apache/hive/service/cli/thrift/RetryingThriftCLIServiceClient.java:
## @@ -310,7 +310,7 @@ protected synchronized TTransport connect(HiveConf conf) throws HiveSQLException
     String host = conf.getVar(HiveConf.ConfVars.HIVE_SERVER2_THRIFT_BIND_HOST);
     int port = conf.getIntVar(HiveConf.ConfVars.HIVE_SERVER2_THRIFT_PORT);
-    int maxThriftMessageSize = (int) conf.getSizeVar(HiveConf.ConfVars.HIVE_THRIFT_CLIENT_MAX_MESSAGE_SIZE);
+    int maxThriftMessageSize = (int) Math.min(conf.getSizeVar(HiveConf.ConfVars.HIVE_THRIFT_CLIENT_MAX_MESSAGE_SIZE), Integer.MAX_VALUE);

Review Comment:
Maybe add a new method to HiveConf like getSizeVar that takes in a long min and a long max. (It may even make sense to have it LOG.info when the value gets clamped (orig val > max, orig val < min), but I am not sure if that would make logs too chatty.) Something like: getSizeVarWithRange(HiveConf.ConfVars.HIVE_THRIFT_CLIENT_MAX_MESSAGE_SIZE, Integer.MIN_VALUE, Integer.MAX_VALUE). I'm bad at naming methods, so pick something that makes sense (though maybe WithRange does?).

Issue Time Tracking
---
Worklog Id: (was: 852080)
Time Spent: 0.5h (was: 20m)

> Increasing default for hive.thrift.client.max.message.size to 2 GB
> --
>
> Key: HIVE-27113
> URL: https://issues.apache.org/jira/browse/HIVE-27113
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Reporter: Riju Trivedi
> Assignee: Riju Trivedi
> Priority: Major
> Labels: pull-request-available
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> HIVE_THRIFT_CLIENT_MAX_MESSAGE_SIZE("hive.thrift.client.max.message.size", "1gb",
>     new SizeValidator(-1L, true, (long) Integer.MAX_VALUE, true),
>     "Thrift client configuration for max message size. 0 or -1 will use the default defined in the Thrift " +
>     "library. The upper limit is 2147483648 bytes (or 2gb).")
> Documentation on the help suggests setting 2147483648 while Integer Max is
> 2147483647. So, it actually becomes -1 and gets set to thrift default limit
> (100 MB)

-- This message was sent by Atlassian Jira (v8.20.10#820010)
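The suggested helper might look like the sketch below. It is hypothetical (no such method exists in HiveConf as of this thread); the name and the logging behavior follow the review comment, with a main method standing in for the HiveConf plumbing:

{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class SizeVarRangeSketch {
  private static final Logger LOG = LoggerFactory.getLogger(SizeVarRangeSketch.class);

  /** Clamp a configured size (in bytes) into [min, max], logging when clamping occurs. */
  public static long getSizeVarWithRange(String varName, long value, long min, long max) {
    if (value < min) {
      LOG.info("Clamping {} from {} up to {}", varName, value, min);
      return min;
    }
    if (value > max) {
      LOG.info("Clamping {} from {} down to {}", varName, value, max);
      return max;
    }
    return value;
  }

  public static void main(String[] args) {
    // A configured 2 GB gets clamped down to Integer.MAX_VALUE:
    System.out.println(getSizeVarWithRange(
        "hive.thrift.client.max.message.size", 2147483648L, -1L, Integer.MAX_VALUE));
  }
}
{code}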
[jira] [Work logged] (HIVE-27113) Increasing default for hive.thrift.client.max.message.size to 2 GB
[ https://issues.apache.org/jira/browse/HIVE-27113?focusedWorklogId=852078&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-852078 ]

ASF GitHub Bot logged work on HIVE-27113:
-
Author: ASF GitHub Bot
Created on: 21/Mar/23 17:59
Start Date: 21/Mar/23 17:59
Worklog Time Spent: 10m

Work Description: jfsii commented on code in PR #4137:
URL: https://github.com/apache/hive/pull/4137#discussion_r1143804658

## service/src/java/org/apache/hive/service/cli/thrift/RetryingThriftCLIServiceClient.java:
## @@ -310,7 +310,7 @@ protected synchronized TTransport connect(HiveConf conf) throws HiveSQLException
     String host = conf.getVar(HiveConf.ConfVars.HIVE_SERVER2_THRIFT_BIND_HOST);
     int port = conf.getIntVar(HiveConf.ConfVars.HIVE_SERVER2_THRIFT_PORT);
-    int maxThriftMessageSize = (int) conf.getSizeVar(HiveConf.ConfVars.HIVE_THRIFT_CLIENT_MAX_MESSAGE_SIZE);
+    int maxThriftMessageSize = (int) Math.min(conf.getSizeVar(HiveConf.ConfVars.HIVE_THRIFT_CLIENT_MAX_MESSAGE_SIZE), Integer.MAX_VALUE);
     LOG.info("Connecting to " + host + ":" + port);

Review Comment:
It might be a good idea to also include the maxThriftMessageSize in the below LOG.info message. It would help in future debugging if needed (and that LOG.info line already exists, so we likely won't be adding a very chatty message).

Issue Time Tracking
---
Worklog Id: (was: 852078)
Time Spent: 20m (was: 10m)

> Increasing default for hive.thrift.client.max.message.size to 2 GB
> --
>
> Key: HIVE-27113
> URL: https://issues.apache.org/jira/browse/HIVE-27113
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Reporter: Riju Trivedi
> Assignee: Riju Trivedi
> Priority: Major
> Labels: pull-request-available
> Time Spent: 20m
> Remaining Estimate: 0h
>
> HIVE_THRIFT_CLIENT_MAX_MESSAGE_SIZE("hive.thrift.client.max.message.size", "1gb",
>     new SizeValidator(-1L, true, (long) Integer.MAX_VALUE, true),
>     "Thrift client configuration for max message size. 0 or -1 will use the default defined in the Thrift " +
>     "library. The upper limit is 2147483648 bytes (or 2gb).")
> Documentation on the help suggests setting 2147483648 while Integer Max is
> 2147483647. So, it actually becomes -1 and gets set to thrift default limit
> (100 MB)

-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-27113) Increasing default for hive.thrift.client.max.message.size to 2 GB
[ https://issues.apache.org/jira/browse/HIVE-27113?focusedWorklogId=852073&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-852073 ]

ASF GitHub Bot logged work on HIVE-27113:
-
Author: ASF GitHub Bot
Created on: 21/Mar/23 17:46
Start Date: 21/Mar/23 17:46
Worklog Time Spent: 10m

Work Description: rtrivedi12 opened a new pull request, #4137:
URL: https://github.com/apache/hive/pull/4137
…e to 2 GB

### What changes were proposed in this pull request?
Changed the default value for thrift message size to 2147483647 bytes instead of 1 GB and fixed the config help message. Also, wrapped the config value to a max of Integer.MAX_VALUE.

### Why are the changes needed?
Wide tables with huge partitions (5k+) can cross the current thrift max message size of 1GB as the partition object contains column descriptors and other properties set by Impala. The help message suggested the upper limit for max message size as 2 GB, which is outside of the Integer.MAX_VALUE range and hence caused overflow when converted to int and got updated to the thrift default (100 MB).

### How was this patch tested?
There is no functionality change introduced, so existing test cases should not be failing with these config value changes.

Issue Time Tracking
---
Worklog Id: (was: 852073)
Remaining Estimate: 0h
Time Spent: 10m

> Increasing default for hive.thrift.client.max.message.size to 2 GB
> --
>
> Key: HIVE-27113
> URL: https://issues.apache.org/jira/browse/HIVE-27113
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Reporter: Riju Trivedi
> Assignee: Riju Trivedi
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
> HIVE_THRIFT_CLIENT_MAX_MESSAGE_SIZE("hive.thrift.client.max.message.size", "1gb",
>     new SizeValidator(-1L, true, (long) Integer.MAX_VALUE, true),
>     "Thrift client configuration for max message size. 0 or -1 will use the default defined in the Thrift " +
>     "library. The upper limit is 2147483648 bytes (or 2gb).")
> Documentation on the help suggests setting 2147483648 while Integer Max is
> 2147483647. So, it actually becomes -1 and gets set to thrift default limit
> (100 MB)

-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HIVE-27113) Increasing default for hive.thrift.client.max.message.size to 2 GB
[ https://issues.apache.org/jira/browse/HIVE-27113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-27113: -- Labels: pull-request-available (was: ) > Increasing default for hive.thrift.client.max.message.size to 2 GB > -- > > Key: HIVE-27113 > URL: https://issues.apache.org/jira/browse/HIVE-27113 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Riju Trivedi >Assignee: Riju Trivedi >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > HIVE_THRIFT_CLIENT_MAX_MESSAGE_SIZE("hive.thrift.client.max.message.size", > "1gb", > new SizeValidator(-1L, true, (long) Integer.MAX_VALUE, true), > "Thrift client configuration for max message size. 0 or -1 will use > the default defined in the Thrift " + > "library. The upper limit is 2147483648 bytes (or 2gb).") > The help documentation suggests setting 2147483648, while Integer.MAX_VALUE is > 2147483647; the value therefore overflows, effectively becomes -1, and gets set to > the Thrift default limit (100 MB) -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-26913) Iceberg: HiveVectorizedReader::parquetRecordReader should reuse footer information
[ https://issues.apache.org/jira/browse/HIVE-26913?focusedWorklogId=852072&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-852072 ] ASF GitHub Bot logged work on HIVE-26913: - Author: ASF GitHub Bot Created on: 21/Mar/23 17:45 Start Date: 21/Mar/23 17:45 Worklog Time Spent: 10m Work Description: deniskuzZ opened a new pull request, #4136: URL: https://github.com/apache/hive/pull/4136 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? Issue Time Tracking --- Worklog Id: (was: 852072) Time Spent: 50m (was: 40m) > Iceberg: HiveVectorizedReader::parquetRecordReader should reuse footer > information > -- > > Key: HIVE-26913 > URL: https://issues.apache.org/jira/browse/HIVE-26913 > Project: Hive > Issue Type: Improvement > Components: Iceberg integration >Reporter: Rajesh Balamohan >Assignee: Ayush Saxena >Priority: Major > Labels: performance, pull-request-available, stability > Fix For: 4.0.0 > > Attachments: Screenshot 2023-01-09 at 4.01.14 PM.png > > Time Spent: 50m > Remaining Estimate: 0h > > HiveVectorizedReader::parquetRecordReader should reuse the details of the parquet > footer instead of reading it again. > > It reads the parquet footer here: > [https://github.com/apache/hive/blob/master/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/vector/HiveVectorizedReader.java#L230-L232] > It reads the footer again here when constructing the vectorized record reader: > [https://github.com/apache/hive/blob/master/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/vector/HiveVectorizedReader.java#L249] > > [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/VectorizedParquetInputFormat.java#L50] > > Check the codepath of > VectorizedParquetRecordReader::setupMetadataAndParquetSplit > [https://github.com/apache/hive/blob/6b0139188aba6a95808c8d1bec63a651ec9e4bdc/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/vector/VectorizedParquetRecordReader.java#L180] > > It should be possible to share "ParquetMetadata" in > VectorizedParquetRecordReader. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
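As a rough illustration of the reuse idea (not the actual Hive patch), the footer can be read once through the standard Parquet APIs and the resulting ParquetMetadata handed to every consumer; the helper class below is hypothetical, only the Parquet classes and calls are real.

{code:java}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.parquet.hadoop.ParquetFileReader;
import org.apache.parquet.hadoop.metadata.ParquetMetadata;
import org.apache.parquet.hadoop.util.HadoopInputFile;

public final class FooterOnceSketch {
  // Read the footer a single time; cache the returned ParquetMetadata and pass it to
  // the vectorized record reader instead of letting it re-read the same bytes.
  public static ParquetMetadata readFooterOnce(Configuration conf, Path file) throws IOException {
    try (ParquetFileReader reader = ParquetFileReader.open(HadoopInputFile.fromPath(file, conf))) {
      return reader.getFooter();
    }
  }
}
{code}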
[jira] [Updated] (HIVE-27163) Column stats are not getting published after an insert query into an external table with custom location
[ https://issues.apache.org/jira/browse/HIVE-27163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Taraka Rama Rao Lethavadla updated HIVE-27163: -- Summary: Column stats are not getting published after an insert query into an external table with custom location (was: Column stats not getting published after an insert query into an external table with custom location) > Column stats are not getting published after an insert query into an external > table with custom location > > > Key: HIVE-27163 > URL: https://issues.apache.org/jira/browse/HIVE-27163 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Taraka Rama Rao Lethavadla >Priority: Major > > Test case details are below > *test.q* > {noformat} > set hive.stats.column.autogather=true; > set hive.stats.autogather=true; > dfs ${system:test.dfs.mkdir} ${system:test.tmp.dir}/test; > create external table test_custom(age int, name string) stored as orc > location '/tmp/test'; > insert into test_custom select 1, 'test'; > desc formatted test_custom age;{noformat} > *test.q.out* > > > {noformat} > A masked pattern was here > PREHOOK: type: CREATETABLE > A masked pattern was here > PREHOOK: Output: database:default > PREHOOK: Output: default@test_custom > A masked pattern was here > POSTHOOK: type: CREATETABLE > A masked pattern was here > POSTHOOK: Output: database:default > POSTHOOK: Output: default@test_custom > PREHOOK: query: insert into test_custom select 1, 'test' > PREHOOK: type: QUERY > PREHOOK: Input: _dummy_database@_dummy_table > PREHOOK: Output: default@test_custom > POSTHOOK: query: insert into test_custom select 1, 'test' > POSTHOOK: type: QUERY > POSTHOOK: Input: _dummy_database@_dummy_table > POSTHOOK: Output: default@test_custom > POSTHOOK: Lineage: test_custom.age SIMPLE [] > POSTHOOK: Lineage: test_custom.name SIMPLE [] > PREHOOK: query: desc formatted test_custom age > PREHOOK: type: DESCTABLE > PREHOOK: Input: default@test_custom > POSTHOOK: query: desc formatted test_custom age > POSTHOOK: type: DESCTABLE > POSTHOOK: Input: default@test_custom > col_name age > data_type int > min > max > num_nulls > distinct_count > avg_col_len > max_col_len > num_trues > num_falses > bit_vector > comment from deserializer{noformat} > As we can see from desc formatted output, column stats were not populated > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-27157) AssertionError when inferring return type for unix_timestamp function
[ https://issues.apache.org/jira/browse/HIVE-27157?focusedWorklogId=852051&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-852051 ] ASF GitHub Bot logged work on HIVE-27157: - Author: ASF GitHub Bot Created on: 21/Mar/23 15:59 Start Date: 21/Mar/23 15:59 Worklog Time Spent: 10m Work Description: zabetak opened a new pull request, #4135: URL: https://github.com/apache/hive/pull/4135 ### What changes were proposed in this pull request? Change the implementation of the unix_timestamp operators to avoid the AssertionError and infer the return type correctly; it is always BIGINT. Break the inheritance relation with SqlAbstractTimeFunction and change the SqlFunctionCategory from TIMEDATE to NUMERIC; unix_timestamp is not a time function, since the result is never among DATE, TIME, or TIMESTAMP. Change the operand type checker to a more truthful implementation; the type checker is not really used at the moment, but it is better to have something realistic there than null or something completely wrong. ### Why are the changes needed? Calls to the `inferReturnType` method for `unix_timestamp` operators always lead to an `AssertionError`. Unlike operand type checking and operand type inference, which are not really relevant for Hive (the latter does not use the `SqlValidator` logic), return type inference is important since it may kick in during some calls to the `RelBuilder`/`RexBuilder` APIs. Such calls exist in older versions of Hive and are widely used in Calcite's built-in rules. ### Does this PR introduce _any_ user-facing change? In this version no, but in older versions of Hive it can fix queries failing with `AssertionError`. ### How was this patch tested? ``` mvn test -Dtest=TestSqlOperatorInferReturnType ``` Issue Time Tracking --- Worklog Id: (was: 852051) Remaining Estimate: 0h Time Spent: 10m > AssertionError when inferring return type for unix_timestamp function > - > > Key: HIVE-27157 > URL: https://issues.apache.org/jira/browse/HIVE-27157 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 4.0.0-alpha-2 >Reporter: Stamatis Zampetakis >Assignee: Stamatis Zampetakis >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > Any attempt to derive the return data type for the {{unix_timestamp}} > function results in the following assertion error. > {noformat} > java.lang.AssertionError: typeName.allowsPrecScale(true, false): BIGINT > at > org.apache.calcite.sql.type.BasicSqlType.checkPrecScale(BasicSqlType.java:65) > at org.apache.calcite.sql.type.BasicSqlType.<init>(BasicSqlType.java:81) > at > org.apache.calcite.sql.type.SqlTypeFactoryImpl.createSqlType(SqlTypeFactoryImpl.java:67) > at > org.apache.calcite.sql.fun.SqlAbstractTimeFunction.inferReturnType(SqlAbstractTimeFunction.java:78) > at > org.apache.calcite.rex.RexBuilder.deriveReturnType(RexBuilder.java:278) > {noformat} > due to a faulty implementation of type inference for the respective operators: > * > [https://github.com/apache/hive/blob/52360151dc43904217e812efde1069d6225e9570/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/reloperators/HiveUnixTimestampSqlOperator.java] > * > [https://github.com/apache/hive/blob/52360151dc43904217e812efde1069d6225e9570/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/reloperators/HiveToUnixTimestampSqlOperator.java] > Although at this stage in master it is not possible to reproduce the problem > with an actual SQL query, the buggy implementation must be fixed, since slight > changes in the code/CBO rules may lead to methods relying on > {{SqlOperator.inferReturnType}}. > Note that in older versions of Hive it is possible to hit the AssertionError > in various ways. For example, in Hive 3.1.3 (and older), the error may come > from > [HiveRelDecorrelator|https://github.com/apache/hive/blob/4df4d75bf1e16fe0af75aad0b4179c34c07fc975/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelDecorrelator.java#L1933] > in the presence of sub-queries. -- This message was sent by Atlassian Jira (v8.20.10#820010)
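A minimal Calcite sketch of the shape of the fix described in the PR: declare BIGINT as the return type directly instead of inheriting SqlAbstractTimeFunction's precision-based inference. The class name and the exact operand checker are illustrative (the real operators live in the two files linked in the ticket description); only the Calcite classes used are real APIs.

{code:java}
import org.apache.calcite.sql.SqlFunction;
import org.apache.calcite.sql.SqlFunctionCategory;
import org.apache.calcite.sql.SqlKind;
import org.apache.calcite.sql.type.OperandTypes;
import org.apache.calcite.sql.type.ReturnTypes;

public class UnixTimestampSketch extends SqlFunction {
  public UnixTimestampSketch() {
    super("UNIX_TIMESTAMP", SqlKind.OTHER_FUNCTION,
        ReturnTypes.BIGINT, // fixed BIGINT return type, so inferReturnType cannot trip the precision assert
        null,               // no operand type inference needed
        OperandTypes.or(OperandTypes.NILADIC, OperandTypes.STRING, OperandTypes.STRING_STRING),
        SqlFunctionCategory.NUMERIC); // not TIMEDATE: the result is never DATE, TIME, or TIMESTAMP
  }
}
{code}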
[jira] [Updated] (HIVE-27157) AssertionError when inferring return type for unix_timestamp function
[ https://issues.apache.org/jira/browse/HIVE-27157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-27157: -- Labels: pull-request-available (was: ) > AssertionError when inferring return type for unix_timestamp function > - > > Key: HIVE-27157 > URL: https://issues.apache.org/jira/browse/HIVE-27157 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 4.0.0-alpha-2 >Reporter: Stamatis Zampetakis >Assignee: Stamatis Zampetakis >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Any attempt to derive the return data type for the {{unix_timestamp}} > function results in the following assertion error. > {noformat} > java.lang.AssertionError: typeName.allowsPrecScale(true, false): BIGINT > at > org.apache.calcite.sql.type.BasicSqlType.checkPrecScale(BasicSqlType.java:65) > at org.apache.calcite.sql.type.BasicSqlType.<init>(BasicSqlType.java:81) > at > org.apache.calcite.sql.type.SqlTypeFactoryImpl.createSqlType(SqlTypeFactoryImpl.java:67) > at > org.apache.calcite.sql.fun.SqlAbstractTimeFunction.inferReturnType(SqlAbstractTimeFunction.java:78) > at > org.apache.calcite.rex.RexBuilder.deriveReturnType(RexBuilder.java:278) > {noformat} > due to a faulty implementation of type inference for the respective operators: > * > [https://github.com/apache/hive/blob/52360151dc43904217e812efde1069d6225e9570/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/reloperators/HiveUnixTimestampSqlOperator.java] > * > [https://github.com/apache/hive/blob/52360151dc43904217e812efde1069d6225e9570/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/reloperators/HiveToUnixTimestampSqlOperator.java] > Although at this stage in master it is not possible to reproduce the problem > with an actual SQL query, the buggy implementation must be fixed, since slight > changes in the code/CBO rules may lead to methods relying on > {{SqlOperator.inferReturnType}}. > Note that in older versions of Hive it is possible to hit the AssertionError > in various ways. For example, in Hive 3.1.3 (and older), the error may come > from > [HiveRelDecorrelator|https://github.com/apache/hive/blob/4df4d75bf1e16fe0af75aad0b4179c34c07fc975/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelDecorrelator.java#L1933] > in the presence of sub-queries. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-27147) HS2 is not accessible to clients via zookeeper when hostname used is not FQDN
[ https://issues.apache.org/jira/browse/HIVE-27147?focusedWorklogId=852042&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-852042 ] ASF GitHub Bot logged work on HIVE-27147: - Author: ASF GitHub Bot Created on: 21/Mar/23 15:16 Start Date: 21/Mar/23 15:16 Worklog Time Spent: 10m Work Description: sonarcloud[bot] commented on PR #4130: URL: https://github.com/apache/hive/pull/4130#issuecomment-1478022310 Kudos, SonarCloud Quality Gate passed! 0 Bugs, 0 Vulnerabilities, 0 Security Hotspots, 0 Code Smells; no coverage or duplication information. Full report: https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=4130 Issue Time Tracking --- Worklog Id: (was: 852042) Time Spent: 1h (was: 50m) > HS2 is not accessible to clients via zookeeper when hostname used is not FQDN > - > > Key: HIVE-27147 > URL: https://issues.apache.org/jira/browse/HIVE-27147 > Project: Hive > Issue Type: Bug >Reporter: Venugopal Reddy K >Priority: Major > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > HS2 is not accessible to clients via zookeeper when the hostname used during > registration comes from InetAddress.getHostName() with JDK 11. This issue is due to a > change in behavior in JDK 11 and is OS-specific: > https://stackoverflow.com/questions/61898627/inetaddress-getlocalhost-gethostname-different-behavior-between-jdk-11-and-j -- This message was sent by Atlassian Jira (v8.20.10#820010)
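The JDK 11 behavior change referenced above is easy to observe directly; a small self-contained probe follows (the hostnames in the comments are invented examples):

{code:java}
import java.net.InetAddress;
import java.net.UnknownHostException;

public class HostnameProbe {
  public static void main(String[] args) throws UnknownHostException {
    InetAddress local = InetAddress.getLocalHost();
    // On JDK 11, on some OSes, this may return only the short name, e.g. "myhost"
    System.out.println("getHostName          = " + local.getHostName());
    // Resolves the fully qualified name, e.g. "myhost.example.com"
    System.out.println("getCanonicalHostName = " + local.getCanonicalHostName());
  }
}
{code}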
[jira] [Work logged] (HIVE-26809) Upgrade ORC to 1.8.3
[ https://issues.apache.org/jira/browse/HIVE-26809?focusedWorklogId=852039&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-852039 ] ASF GitHub Bot logged work on HIVE-26809: - Author: ASF GitHub Bot Created on: 21/Mar/23 14:54 Start Date: 21/Mar/23 14:54 Worklog Time Spent: 10m Work Description: sonarcloud[bot] commented on PR #4134: URL: https://github.com/apache/hive/pull/4134#issuecomment-1477986304 Kudos, SonarCloud Quality Gate passed! 0 Bugs, 0 Vulnerabilities, 0 Security Hotspots, 10 Code Smells; no coverage or duplication information. Full report: https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=4134 Issue Time Tracking --- Worklog Id: (was: 852039) Time Spent: 7h 20m (was: 7h 10m) > Upgrade ORC to 1.8.3 > > > Key: HIVE-26809 > URL: https://issues.apache.org/jira/browse/HIVE-26809 > Project: Hive > Issue Type: Improvement >Affects Versions: 4.0.0 >Reporter: Dmitriy Fingerman >Assignee: Zoltán Rátkai >Priority: Major > Labels: pull-request-available > Time Spent: 7h 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (HIVE-27161) MetaException when executing CTAS query in Druid storage handler
[ https://issues.apache.org/jira/browse/HIVE-27161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17703237#comment-17703237 ] Stamatis Zampetakis commented on HIVE-27161: cc [~kkasa] since you worked on HIVE-26771 > MetaException when executing CTAS query in Druid storage handler > > > Key: HIVE-27161 > URL: https://issues.apache.org/jira/browse/HIVE-27161 > Project: Hive > Issue Type: Bug > Components: Druid integration >Affects Versions: 4.0.0-alpha-2 >Reporter: Stamatis Zampetakis >Priority: Major > > Any kind of CTAS query targeting the Druid storage handler fails with the > following exception: > {noformat} > org.apache.hadoop.hive.ql.metadata.HiveException: > MetaException(message:LOCATION may not be specified for Druid) > at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:1347) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:1352) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.ddl.table.create.CreateTableOperation.createTableNonReplaceMode(CreateTableOperation.java:158) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.ddl.table.create.CreateTableOperation.execute(CreateTableOperation.java:116) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at org.apache.hadoop.hive.ql.ddl.DDLTask.execute(DDLTask.java:84) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:214) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:354) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:327) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:244) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:105) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:367) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:205) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:154) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:149) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:185) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:228) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:257) > ~[hive-cli-4.0.0-SNAPSHOT.jar:?] > at org.apache.hadoop.hive.cli.CliDriver.processCmd1(CliDriver.java:201) > ~[hive-cli-4.0.0-SNAPSHOT.jar:?] > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:127) > ~[hive-cli-4.0.0-SNAPSHOT.jar:?] > at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:425) > ~[hive-cli-4.0.0-SNAPSHOT.jar:?] > at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:356) > ~[hive-cli-4.0.0-SNAPSHOT.jar:?] 
> at > org.apache.hadoop.hive.ql.dataset.QTestDatasetHandler.initDataset(QTestDatasetHandler.java:86) > ~[hive-it-util-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.dataset.QTestDatasetHandler.beforeTest(QTestDatasetHandler.java:190) > ~[hive-it-util-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.qoption.QTestOptionDispatcher.beforeTest(QTestOptionDispatcher.java:79) > ~[hive-it-util-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at org.apache.hadoop.hive.ql.QTestUtil.cliInit(QTestUtil.java:607) > ~[hive-it-util-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:112) > ~[hive-it-util-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:157) > ~[hive-it-util-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver(TestMiniDruidCliDriver.java:60) > ~[test-classes/:?] > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > ~[?:1.8.0_261] > at > sun.reflect.NativeMethodAccessorImp
[jira] [Work logged] (HIVE-27020) Implement a separate handler to handle aborted transaction cleanup
[ https://issues.apache.org/jira/browse/HIVE-27020?focusedWorklogId=852028&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-852028 ] ASF GitHub Bot logged work on HIVE-27020: - Author: ASF GitHub Bot Created on: 21/Mar/23 14:03 Start Date: 21/Mar/23 14:03 Worklog Time Spent: 10m Work Description: veghlaci05 commented on code in PR #4091: URL: https://github.com/apache/hive/pull/4091#discussion_r1143419790 ## ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/handler/AcidTxnCleaner.java: ## @@ -0,0 +1,125 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.hadoop.hive.ql.txn.compactor.handler; + +import org.apache.hadoop.fs.FileSystem; +import org.apache.hadoop.fs.Path; +import org.apache.hadoop.hive.common.ValidReadTxnList; +import org.apache.hadoop.hive.common.ValidReaderWriteIdList; +import org.apache.hadoop.hive.common.ValidTxnList; +import org.apache.hadoop.hive.conf.HiveConf; +import org.apache.hadoop.hive.metastore.api.GetValidWriteIdsRequest; +import org.apache.hadoop.hive.metastore.api.GetValidWriteIdsResponse; +import org.apache.hadoop.hive.metastore.api.MetaException; +import org.apache.hadoop.hive.metastore.api.NoSuchTxnException; +import org.apache.hadoop.hive.metastore.api.Table; +import org.apache.hadoop.hive.metastore.metrics.AcidMetricService; +import org.apache.hadoop.hive.metastore.txn.AcidTxnInfo; +import org.apache.hadoop.hive.metastore.txn.TxnCommonUtils; +import org.apache.hadoop.hive.metastore.txn.TxnStore; +import org.apache.hadoop.hive.ql.io.AcidDirectory; +import org.apache.hadoop.hive.ql.io.AcidUtils; +import org.apache.hadoop.hive.ql.txn.compactor.CleanupRequest.CleanupRequestBuilder; +import org.apache.hadoop.hive.ql.txn.compactor.CompactorUtil; +import org.apache.hadoop.hive.ql.txn.compactor.FSRemover; +import org.apache.hadoop.hive.ql.txn.compactor.MetadataCache; +import org.apache.hive.common.util.Ref; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.util.BitSet; +import java.util.Collections; +import java.util.List; +import java.util.Map; + +import static org.apache.commons.collections.ListUtils.subtract; + +/** + * An abstract class extending TaskHandler which contains the common methods from + * CompactionCleaner and AbortedTxnCleaner. + */ +abstract class AcidTxnCleaner extends TaskHandler { Review Comment: Why not merge this class with `TaskHandler`? Both are abstract, and TaskHandler has no other subclasses. I see no reason keeping these classes separate. ## ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/handler/AbortedTxnCleaner.java: ## @@ -0,0 +1,168 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. 
See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.hadoop.hive.ql.txn.compactor.handler; + +import org.apache.hadoop.hive.common.ValidReaderWriteIdList; +import org.apache.hadoop.hive.common.ValidTxnList; +import org.apache.hadoop.hive.conf.HiveConf; +import org.apache.hadoop.hive.metastore.api.MetaException; +import org.apache.hadoop.hive.metastore.api.Partition; +import org.apache.hadoop.hive.metastore.api.Table; +import org.apache.hadoop.hive.metastore.metrics.MetricsConstants; +import org.apache.hadoop.hive.metastore.metrics.PerfLogger; +import org.apache.hadoop.hive.metastore.txn.AcidTxnInfo; +import org.apache.hadoop.hive.me
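The hierarchy proposed in the review above amounts to the following skeleton; the class bodies are elided and only the class names come from the thread itself:

{code:java}
// One abstract base holding the shared cleanup helpers, with exactly two concrete handlers.
abstract class TaskHandler {
  // fields and utility methods formerly split between TaskHandler and AcidTxnCleaner
}

class CompactionCleaner extends TaskHandler {
  // compaction-related cleanup
}

class AbortedTxnCleaner extends TaskHandler {
  // aborted-transaction cleanup
}
{code}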
[jira] [Work logged] (HIVE-26809) Upgrade ORC to 1.8.3
[ https://issues.apache.org/jira/browse/HIVE-26809?focusedWorklogId=852021&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-852021 ] ASF GitHub Bot logged work on HIVE-26809: - Author: ASF GitHub Bot Created on: 21/Mar/23 13:49 Start Date: 21/Mar/23 13:49 Worklog Time Spent: 10m Work Description: difin opened a new pull request, #4134: URL: https://github.com/apache/hive/pull/4134 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? Issue Time Tracking --- Worklog Id: (was: 852021) Time Spent: 7h 10m (was: 7h) > Upgrade ORC to 1.8.3 > > > Key: HIVE-26809 > URL: https://issues.apache.org/jira/browse/HIVE-26809 > Project: Hive > Issue Type: Improvement >Affects Versions: 4.0.0 >Reporter: Dmitriy Fingerman >Assignee: Zoltán Rátkai >Priority: Major > Labels: pull-request-available > Time Spent: 7h 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-27135) Cleaner fails with FileNotFoundException
[ https://issues.apache.org/jira/browse/HIVE-27135?focusedWorklogId=852018&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-852018 ] ASF GitHub Bot logged work on HIVE-27135: - Author: ASF GitHub Bot Created on: 21/Mar/23 13:34 Start Date: 21/Mar/23 13:34 Worklog Time Spent: 10m Work Description: veghlaci05 commented on code in PR #4114: URL: https://github.com/apache/hive/pull/4114#discussion_r1143382486 ## ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java: ## @@ -1538,32 +1538,36 @@ private static HdfsDirSnapshot addToSnapshot(Map dirToSna public static Map getHdfsDirSnapshots(final FileSystem fs, final Path path) throws IOException { Map dirToSnapshots = new HashMap<>(); -RemoteIterator itr = FileUtils.listFiles(fs, path, true, acidHiddenFileFilter); -while (itr.hasNext()) { - FileStatus fStatus = itr.next(); - Path fPath = fStatus.getPath(); - if (fStatus.isDirectory() && acidTempDirFilter.accept(fPath)) { -addToSnapshot(dirToSnapshots, fPath); - } else { -Path parentDirPath = fPath.getParent(); -if (acidTempDirFilter.accept(parentDirPath)) { - while (isChildOfDelta(parentDirPath, path)) { -// Some cases there are other directory layers between the delta and the datafiles -// (export-import mm table, insert with union all to mm table, skewed tables). -// But it does not matter for the AcidState, we just need the deltas and the data files -// So build the snapshot with the files inside the delta directory -parentDirPath = parentDirPath.getParent(); - } - HdfsDirSnapshot dirSnapshot = addToSnapshot(dirToSnapshots, parentDirPath); - // We're not filtering out the metadata file and acid format file, - // as they represent parts of a valid snapshot - // We're not using the cached values downstream, but we can potentially optimize more in a follow-up task - if (fStatus.getPath().toString().contains(MetaDataFile.METADATA_FILE)) { -dirSnapshot.addMetadataFile(fStatus); - } else if (fStatus.getPath().toString().contains(OrcAcidVersion.ACID_FORMAT)) { -dirSnapshot.addOrcAcidFormatFile(fStatus); - } else { -dirSnapshot.addFile(fStatus); +Deque> stack = new ArrayDeque<>(); Review Comment: @deniskuzZ I think this approach is fine for now, as @mdayakar mentioned `getHdfsDirSnapshotsForCleaner()` does the same. However, I would create a follow-up task to eliminate the workaround once HADOOP-18662 is merged. Issue Time Tracking --- Worklog Id: (was: 852018) Time Spent: 2.5h (was: 2h 20m) > Cleaner fails with FileNotFoundException > > > Key: HIVE-27135 > URL: https://issues.apache.org/jira/browse/HIVE-27135 > Project: Hive > Issue Type: Bug >Reporter: Dayakar M >Assignee: Dayakar M >Priority: Major > Labels: pull-request-available > Time Spent: 2.5h > Remaining Estimate: 0h > > The compaction fails when the Cleaner tries to remove a missing directory > from HDFS. > {code:java} > 2023-03-06 07:45:48,331 ERROR > org.apache.hadoop.hive.ql.txn.compactor.Cleaner: > [Cleaner-executor-thread-12]: Caught exception when cleaning, unable to > complete cleaning of > id:39762523,dbname:test,tableName:test_table,partName:null,state:,type:MINOR,enqueueTime:0,start:0,properties:null,runAs:hive,tooManyAborts:false,hasOldAbort:false,highestWriteId:989,errorMessage:null,workerId: > null,initiatorId: null java.io.FileNotFoundException: File > hdfs:/cluster/warehouse/tablespace/managed/hive/test.db/test_table/.hive-staging_hive_2023-03-06_07-45-23_120_4659605113266849995-73550 > does not exist. > at > org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1275) > at > org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1249) > at > org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1194) > at > org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1190) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.listLocatedStatus(DistributedFileSystem.java:1208) > at org.apache.hadoop.fs.FileSystem.listLocatedStatus(FileSystem.java:2144) > at org.apache.hadoop.fs.FileSystem$5.handleFileStat(FileSystem.java:2332) > at org.apache.hadoop.fs.FileSystem$5.hasNext(FileSystem.java:2309) > at > org.apache.hadoop.util.functional.RemoteIterators$WrappingRemoteIterator.sourceHasNext
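A minimal sketch of the stack-based traversal discussed in this thread, tolerating directories that disappear mid-scan; the class and method names are illustrative, while FileSystem.listStatusIterator and RemoteIterator are real Hadoop APIs.

{code:java}
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Deque;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;

public final class TolerantWalkSketch {
  // Depth-first listing that skips directories deleted concurrently (e.g. .hive-staging dirs)
  // instead of failing the whole run with FileNotFoundException.
  static void walk(FileSystem fs, Path root) throws IOException {
    Deque<Path> stack = new ArrayDeque<>();
    stack.push(root);
    while (!stack.isEmpty()) {
      Path dir = stack.pop();
      RemoteIterator<FileStatus> it;
      try {
        it = fs.listStatusIterator(dir);
      } catch (FileNotFoundException e) {
        continue; // directory vanished between discovery and listing: skip it
      }
      while (it.hasNext()) {
        FileStatus st = it.next();
        if (st.isDirectory()) {
          stack.push(st.getPath());
        }
        // else: record the file in the snapshot being built
      }
    }
  }
}
{code}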
[jira] [Work logged] (HIVE-27151) Revert "HIVE-21685 Wrong simplification in query with multiple IN clauses"
[ https://issues.apache.org/jira/browse/HIVE-27151?focusedWorklogId=852012&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-852012 ] ASF GitHub Bot logged work on HIVE-27151: - Author: ASF GitHub Bot Created on: 21/Mar/23 13:13 Start Date: 21/Mar/23 13:13 Worklog Time Spent: 10m Work Description: vihangk1 commented on PR #4125: URL: https://github.com/apache/hive/pull/4125#issuecomment-1477817372 I submitted http://ci.hive.apache.org/job/hive-flaky-check/620/console to confirm whether the new failure is a flaky test. Let's wait to see if it is indeed flaky and then we can merge this PR. Issue Time Tracking --- Worklog Id: (was: 852012) Time Spent: 40m (was: 0.5h) > Revert "HIVE-21685 Wrong simplification in query with multiple IN clauses" > -- > > Key: HIVE-27151 > URL: https://issues.apache.org/jira/browse/HIVE-27151 > Project: Hive > Issue Type: Sub-task >Reporter: Aman Raj >Assignee: Aman Raj >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > The multi_in_clause.q test fails because Hive is not able to parse > explain cbo > select * from very_simple_table_for_in_test where name IN('g','r') AND name > IN('a','b') > I am able to make this work locally. We have 2 options: > a. Revert HIVE-21685, since this scenario was not validated before the test was > added. > b. Cherry-pick the fix from https://issues.apache.org/jira/browse/HIVE-20718, > but that also requires cherry-picking > https://issues.apache.org/jira/browse/HIVE-17040, since HIVE-20718 has a lot > of merge conflicts with HIVE-17040; and after cherry-picking these we have > other failures to fix. > > I am reverting this ticket for now. > Exception stacktrace: > {code:java} > 2023-03-16 12:33:11 Completed running task attempt: > attempt_1678994907903_0001_185_01_00_0 2023-03-16 12:33:11 Completed Dag: > dag_1678994907903_0001_185 TRACE StatusLogger Log4jLoggerFactory.getContext() > found anchor class org.apache.hadoop.hive.ql.exec.Operator TRACE StatusLogger > Log4jLoggerFactory.getContext() found anchor class > org.apache.hadoop.hive.ql.stats.fs.FSStatsPublisher TRACE StatusLogger > Log4jLoggerFactory.getContext() found anchor class > org.apache.hadoop.hive.ql.stats.fs.FSStatsAggregator NoViableAltException(24@[]) > at > org.apache.hadoop.hive.ql.parse.HiveParser.explainStatement(HiveParser.java:1512) > at > org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1407) > at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:230) > at org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:79) at > org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:72) at > org.apache.hadoop.hive.ql.Driver.compile(Driver.java:617) at > org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1854) at > org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1801) at > org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1796) at > org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:126) > at > org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:214) > at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:188) at > org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:402) at > org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:335) at > org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:1474) > at >
org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:1448) > at > org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:177) > at > org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:104) > at > org.apache.hadoop.hive.cli.split12.TestMiniLlapLocalCliDriver.testCliDriver(TestMiniLlapLocalCliDriver.java:62) > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-27153) Revert "HIVE-20182: Backport HIVE-20067 to branch-3"
[ https://issues.apache.org/jira/browse/HIVE-27153?focusedWorklogId=852011&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-852011 ] ASF GitHub Bot logged work on HIVE-27153: - Author: ASF GitHub Bot Created on: 21/Mar/23 13:09 Start Date: 21/Mar/23 13:09 Worklog Time Spent: 10m Work Description: vihangk1 merged PR #4127: URL: https://github.com/apache/hive/pull/4127 Issue Time Tracking --- Worklog Id: (was: 852011) Time Spent: 50m (was: 40m) > Revert "HIVE-20182: Backport HIVE-20067 to branch-3" > > > Key: HIVE-27153 > URL: https://issues.apache.org/jira/browse/HIVE-27153 > Project: Hive > Issue Type: Sub-task >Reporter: Aman Raj >Assignee: Aman Raj >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > The mm_all.q test is failing because of this commit. This commit was not > validated before committing. > There is no stack trace for this exception. Link to the exception : > [http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-4126/2/tests] > > {code:java} > java.lang.AssertionError: Client execution failed with error code = 1 running > "insert into table part_mm_n0 partition(key_mm=455) select key from > intermediate_n0" fname=mm_all.q See ./ql/target/tmp/log/hive.log or > ./itests/qtest/target/tmp/log/hive.log, or check ./ql/target/surefire-reports > or ./itests/qtest/target/surefire-reports/ for specific test cases logs. at > org.junit.Assert.fail(Assert.java:88)at > org.apache.hadoop.hive.ql.QTestUtil.failed(QTestUtil.java:2232) at > org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:180) > at > org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:104) > at > org.apache.hadoop.hive.cli.split1.TestMiniLlapCliDriver.testCliDriver(TestMiniLlapCliDriver.java:62) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) >at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) {code} > > > Found the actual error : > {code:java} > 2023-03-19T15:18:07,705 DEBUG [699603ee-f4a1-43b7-b160-7faf858ca4b4 main] > converters.ArrayConverter: Converting 'java.net.URL[]' value > '[Ljava.net.URL;@7535f28' to type 'java.net.URL[]' > 2023-03-19T15:18:07,705 DEBUG [699603ee-f4a1-43b7-b160-7faf858ca4b4 main] > converters.ArrayConverter: No conversion required, value is already a > java.net.URL[] > 2023-03-19T15:18:07,819 INFO [699603ee-f4a1-43b7-b160-7faf858ca4b4 main] > beanutils.FluentPropertyBeanIntrospector: Error when creating > PropertyDescriptor for public final void > org.apache.commons.configuration2.AbstractConfiguration.setProperty(java.lang.String,java.lang.Object)! > Ignoring this property. 
> 2023-03-19T15:18:07,819 DEBUG [699603ee-f4a1-43b7-b160-7faf858ca4b4 main] > beanutils.FluentPropertyBeanIntrospector: Exception is: > java.beans.IntrospectionException: bad write method arg count: public final > void > org.apache.commons.configuration2.AbstractConfiguration.setProperty(java.lang.String,java.lang.Object) > at > java.beans.PropertyDescriptor.findPropertyType(PropertyDescriptor.java:657) > ~[?:1.8.0_342] > at > java.beans.PropertyDescriptor.setWriteMethod(PropertyDescriptor.java:327) > ~[?:1.8.0_342] > at java.beans.PropertyDescriptor.<init>(PropertyDescriptor.java:139) > ~[?:1.8.0_342] > at > org.apache.commons.beanutils.FluentPropertyBeanIntrospector.createFluentPropertyDescritor(FluentPropertyBeanIntrospector.java:178) > ~[commons-beanutils-1.9.3.jar:1.9.3] > at > org.apache.commons.beanutils.FluentPropertyBeanIntrospector.introspect(FluentPropertyBeanIntrospector.java:141) > [commons-beanutils-1.9.3.jar:1.9.3] > at > org.apache.commons.beanutils.PropertyUtilsBean.fetchIntrospectionData(PropertyUtilsBean.java:2245) > [commons-beanutils-1.9.3.jar:1.9.3] > at > org.apache.commons.beanutils.PropertyUtilsBean.getIntrospectionData(PropertyUtilsBean.java:2226) > [commons-beanutils-1.9.3.jar:1.9.3] > at > org.apache.commons.beanutils.PropertyUtilsBean.getPropertyDescriptor(PropertyUtilsBean.java:954) > [commons-beanutils-1.9.3.jar:1.9.3] > at > org.apache.commons.beanutils.PropertyUtilsBean.isWriteable(PropertyUtilsBean.java:1478) > [commons-beanutils-1.9.3.jar:1.9.3] > at > org.apache.commons.configuration2.beanutils.BeanHelper.isPropertyWriteable(BeanHelper.java:521) > [commons-configuration2-2.1.1.jar:2.1.1] > at > org.apache.commons.configuration2.beanutils.BeanHelper.i
[jira] [Work logged] (HIVE-27153) Revert "HIVE-20182: Backport HIVE-20067 to branch-3"
[ https://issues.apache.org/jira/browse/HIVE-27153?focusedWorklogId=852010&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-852010 ] ASF GitHub Bot logged work on HIVE-27153: - Author: ASF GitHub Bot Created on: 21/Mar/23 13:08 Start Date: 21/Mar/23 13:08 Worklog Time Spent: 10m Work Description: vihangk1 commented on PR #4127: URL: https://github.com/apache/hive/pull/4127#issuecomment-1477809023 Thanks @amanraj2520 I checked if this commit is present in released 3.1.3 version and I could not find it. So I think I am good with reverting the PR. +1 Issue Time Tracking --- Worklog Id: (was: 852010) Time Spent: 40m (was: 0.5h) > Revert "HIVE-20182: Backport HIVE-20067 to branch-3" > > > Key: HIVE-27153 > URL: https://issues.apache.org/jira/browse/HIVE-27153 > Project: Hive > Issue Type: Sub-task >Reporter: Aman Raj >Assignee: Aman Raj >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > The mm_all.q test is failing because of this commit. This commit was not > validated before committing. > There is no stack trace for this exception. Link to the exception : > [http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-4126/2/tests] > > {code:java} > java.lang.AssertionError: Client execution failed with error code = 1 running > "insert into table part_mm_n0 partition(key_mm=455) select key from > intermediate_n0" fname=mm_all.q See ./ql/target/tmp/log/hive.log or > ./itests/qtest/target/tmp/log/hive.log, or check ./ql/target/surefire-reports > or ./itests/qtest/target/surefire-reports/ for specific test cases logs. at > org.junit.Assert.fail(Assert.java:88)at > org.apache.hadoop.hive.ql.QTestUtil.failed(QTestUtil.java:2232) at > org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:180) > at > org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:104) > at > org.apache.hadoop.hive.cli.split1.TestMiniLlapCliDriver.testCliDriver(TestMiniLlapCliDriver.java:62) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) >at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) {code} > > > Found the actual error : > {code:java} > 2023-03-19T15:18:07,705 DEBUG [699603ee-f4a1-43b7-b160-7faf858ca4b4 main] > converters.ArrayConverter: Converting 'java.net.URL[]' value > '[Ljava.net.URL;@7535f28' to type 'java.net.URL[]' > 2023-03-19T15:18:07,705 DEBUG [699603ee-f4a1-43b7-b160-7faf858ca4b4 main] > converters.ArrayConverter: No conversion required, value is already a > java.net.URL[] > 2023-03-19T15:18:07,819 INFO [699603ee-f4a1-43b7-b160-7faf858ca4b4 main] > beanutils.FluentPropertyBeanIntrospector: Error when creating > PropertyDescriptor for public final void > org.apache.commons.configuration2.AbstractConfiguration.setProperty(java.lang.String,java.lang.Object)! > Ignoring this property. 
> 2023-03-19T15:18:07,819 DEBUG [699603ee-f4a1-43b7-b160-7faf858ca4b4 main] > beanutils.FluentPropertyBeanIntrospector: Exception is: > java.beans.IntrospectionException: bad write method arg count: public final > void > org.apache.commons.configuration2.AbstractConfiguration.setProperty(java.lang.String,java.lang.Object) > at > java.beans.PropertyDescriptor.findPropertyType(PropertyDescriptor.java:657) > ~[?:1.8.0_342] > at > java.beans.PropertyDescriptor.setWriteMethod(PropertyDescriptor.java:327) > ~[?:1.8.0_342] > at java.beans.PropertyDescriptor.<init>(PropertyDescriptor.java:139) > ~[?:1.8.0_342] > at > org.apache.commons.beanutils.FluentPropertyBeanIntrospector.createFluentPropertyDescritor(FluentPropertyBeanIntrospector.java:178) > ~[commons-beanutils-1.9.3.jar:1.9.3] > at > org.apache.commons.beanutils.FluentPropertyBeanIntrospector.introspect(FluentPropertyBeanIntrospector.java:141) > [commons-beanutils-1.9.3.jar:1.9.3] > at > org.apache.commons.beanutils.PropertyUtilsBean.fetchIntrospectionData(PropertyUtilsBean.java:2245) > [commons-beanutils-1.9.3.jar:1.9.3] > at > org.apache.commons.beanutils.PropertyUtilsBean.getIntrospectionData(PropertyUtilsBean.java:2226) > [commons-beanutils-1.9.3.jar:1.9.3] > at > org.apache.commons.beanutils.PropertyUtilsBean.getPropertyDescriptor(PropertyUtilsBean.java:954) > [commons-beanutils-1.9.3.jar:1.9.3] > at > org.apache.commons.beanutils.PropertyUtilsBean.isWriteable(PropertyUtilsBean.java:1478) > [commons-beanutils-1.9.3.jar:1.9.3] > at > org.apache.
[jira] [Work logged] (HIVE-27148) Disable TestJdbcGenericUDTFGetSplits
[ https://issues.apache.org/jira/browse/HIVE-27148?focusedWorklogId=852007&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-852007 ] ASF GitHub Bot logged work on HIVE-27148: - Author: ASF GitHub Bot Created on: 21/Mar/23 13:03 Start Date: 21/Mar/23 13:03 Worklog Time Spent: 10m Work Description: vihangk1 commented on PR #4129: URL: https://github.com/apache/hive/pull/4129#issuecomment-1477801632 @amanraj2520 Can you please review this? I am not sure why the spell checking is timing out. Issue Time Tracking --- Worklog Id: (was: 852007) Time Spent: 1.5h (was: 1h 20m) > Disable TestJdbcGenericUDTFGetSplits > > > Key: HIVE-27148 > URL: https://issues.apache.org/jira/browse/HIVE-27148 > Project: Hive > Issue Type: Sub-task > Components: Tests >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Major > Labels: pull-request-available > Time Spent: 1.5h > Remaining Estimate: 0h > > TestJdbcGenericUDTFGetSplits is flaky and intermittently fails. > http://ci.hive.apache.org/job/hive-flaky-check/614/ -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-27135) Cleaner fails with FileNotFoundException
[ https://issues.apache.org/jira/browse/HIVE-27135?focusedWorklogId=852001&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-852001 ] ASF GitHub Bot logged work on HIVE-27135: - Author: ASF GitHub Bot Created on: 21/Mar/23 12:37 Start Date: 21/Mar/23 12:37 Worklog Time Spent: 10m Work Description: mdayakar commented on code in PR #4114: URL: https://github.com/apache/hive/pull/4114#discussion_r1143305701 ## ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java: ## @@ -1538,32 +1538,36 @@ private static HdfsDirSnapshot addToSnapshot(Map dirToSna public static Map getHdfsDirSnapshots(final FileSystem fs, final Path path) throws IOException { Map dirToSnapshots = new HashMap<>(); -RemoteIterator itr = FileUtils.listFiles(fs, path, true, acidHiddenFileFilter); -while (itr.hasNext()) { - FileStatus fStatus = itr.next(); - Path fPath = fStatus.getPath(); - if (fStatus.isDirectory() && acidTempDirFilter.accept(fPath)) { -addToSnapshot(dirToSnapshots, fPath); - } else { -Path parentDirPath = fPath.getParent(); -if (acidTempDirFilter.accept(parentDirPath)) { - while (isChildOfDelta(parentDirPath, path)) { -// Some cases there are other directory layers between the delta and the datafiles -// (export-import mm table, insert with union all to mm table, skewed tables). -// But it does not matter for the AcidState, we just need the deltas and the data files -// So build the snapshot with the files inside the delta directory -parentDirPath = parentDirPath.getParent(); - } - HdfsDirSnapshot dirSnapshot = addToSnapshot(dirToSnapshots, parentDirPath); - // We're not filtering out the metadata file and acid format file, - // as they represent parts of a valid snapshot - // We're not using the cached values downstream, but we can potentially optimize more in a follow-up task - if (fStatus.getPath().toString().contains(MetaDataFile.METADATA_FILE)) { -dirSnapshot.addMetadataFile(fStatus); - } else if (fStatus.getPath().toString().contains(OrcAcidVersion.ACID_FORMAT)) { -dirSnapshot.addOrcAcidFormatFile(fStatus); - } else { -dirSnapshot.addFile(fStatus); +Deque> stack = new ArrayDeque<>(); Review Comment: @deniskuzZ I just followed the same approach implemented in the _getHdfsDirSnapshotsForCleaner()_ API. The main problem here is with the Hadoop API: @ayushtkn already raised [HADOOP-18662](https://issues.apache.org/jira/browse/HADOOP-18662), but the fix is not yet merged. Issue Time Tracking --- Worklog Id: (was: 852001) Time Spent: 2h 20m (was: 2h 10m) > Cleaner fails with FileNotFoundException > > > Key: HIVE-27135 > URL: https://issues.apache.org/jira/browse/HIVE-27135 > Project: Hive > Issue Type: Bug >Reporter: Dayakar M >Assignee: Dayakar M >Priority: Major > Labels: pull-request-available > Time Spent: 2h 20m > Remaining Estimate: 0h > > The compaction fails when the Cleaner tries to remove a missing directory > from HDFS. > {code:java} > 2023-03-06 07:45:48,331 ERROR > org.apache.hadoop.hive.ql.txn.compactor.Cleaner: > [Cleaner-executor-thread-12]: Caught exception when cleaning, unable to > complete cleaning of > id:39762523,dbname:test,tableName:test_table,partName:null,state:,type:MINOR,enqueueTime:0,start:0,properties:null,runAs:hive,tooManyAborts:false,hasOldAbort:false,highestWriteId:989,errorMessage:null,workerId: > null,initiatorId: null java.io.FileNotFoundException: File > hdfs:/cluster/warehouse/tablespace/managed/hive/test.db/test_table/.hive-staging_hive_2023-03-06_07-45-23_120_4659605113266849995-73550 > does not exist. > at > org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1275) > at > org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1249) > at > org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1194) > at > org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1190) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.listLocatedStatus(DistributedFileSystem.java:1208) > at org.apache.hadoop.fs.FileSystem.listLocatedStatus(FileSystem.java:2144) > at org.apache.hadoop.fs.FileSystem$5.handleFileStat(FileSystem.java:2332) > at org.apache.hadoop.fs.FileSystem$5.hasNext(FileSystem.java:2309) > at > org.apache.hadoop.util.functio
[jira] [Resolved] (HIVE-22813) Hive query fails if table location is in remote EZ and it's readonly
[ https://issues.apache.org/jira/browse/HIVE-22813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] László Végh resolved HIVE-22813. Resolution: Fixed Merged to master, thanks [~kkasa], [~sbadhya] for the review! > Hive query fails if table location is in remote EZ and it's readonly > > > Key: HIVE-22813 > URL: https://issues.apache.org/jira/browse/HIVE-22813 > Project: Hive > Issue Type: Bug >Reporter: Purshotam Shah >Assignee: László Végh >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Attachments: HIVE-22813.patch > > Time Spent: 1h 20m > Remaining Estimate: 0h > > {code} > [purushah@gwrd352n21 ~]$ hive > hive> select * from puru_db.page_view_ez; > FAILED: SemanticException Unable to compare key strength for > hdfs://nn1/<>/puru_db_ez/page_view_ez and > hdfs://nn2:8020/tmp/puru/d558ac89-1359-424c-92ee-d0fefa8e6593/hive_2020-01-31_19-46-55_114_644945433042922-1/-mr-1 > : java.lang.IllegalArgumentException: Wrong FS: > hdfs://nn1:8020/<>/puru_db_ez/page_view_ez, expected: hdfs://nn2 > hive> > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (HIVE-22813) Hive query fails if table location is in remote EZ and it's readonly
[ https://issues.apache.org/jira/browse/HIVE-22813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] László Végh reassigned HIVE-22813: -- Assignee: László Végh (was: Purshotam Shah) > Hive query fails if table location is in remote EZ and it's readonly > > > Key: HIVE-22813 > URL: https://issues.apache.org/jira/browse/HIVE-22813 > Project: Hive > Issue Type: Bug >Reporter: Purshotam Shah >Assignee: László Végh >Priority: Major > Labels: pull-request-available > Attachments: HIVE-22813.patch > > Time Spent: 1h 20m > Remaining Estimate: 0h > > {code} > [purushah@gwrd352n21 ~]$ hive > hive> select * from puru_db.page_view_ez; > FAILED: SemanticException Unable to compare key strength for > hdfs://nn1/<>/puru_db_ez/page_view_ez and > hdfs://nn2:8020/tmp/puru/d558ac89-1359-424c-92ee-d0fefa8e6593/hive_2020-01-31_19-46-55_114_644945433042922-1/-mr-1 > : java.lang.IllegalArgumentException: Wrong FS: > hdfs://nn1:8020/<>/puru_db_ez/page_view_ez, expected: hdfs://nn2 > hive> > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-27135) Cleaner fails with FileNotFoundException
[ https://issues.apache.org/jira/browse/HIVE-27135?focusedWorklogId=851999&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-851999 ] ASF GitHub Bot logged work on HIVE-27135: - Author: ASF GitHub Bot Created on: 21/Mar/23 12:33 Start Date: 21/Mar/23 12:33 Worklog Time Spent: 10m Work Description: mdayakar commented on code in PR #4114: URL: https://github.com/apache/hive/pull/4114#discussion_r1143305701
## ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java:
## @@ -1538,32 +1538,36 @@ private static HdfsDirSnapshot addToSnapshot(Map<Path, HdfsDirSnapshot> dirToSnapshots,
 public static Map<Path, HdfsDirSnapshot> getHdfsDirSnapshots(final FileSystem fs, final Path path) throws IOException {
   Map<Path, HdfsDirSnapshot> dirToSnapshots = new HashMap<>();
-  RemoteIterator<LocatedFileStatus> itr = FileUtils.listFiles(fs, path, true, acidHiddenFileFilter);
-  while (itr.hasNext()) {
-    FileStatus fStatus = itr.next();
-    Path fPath = fStatus.getPath();
-    if (fStatus.isDirectory() && acidTempDirFilter.accept(fPath)) {
-      addToSnapshot(dirToSnapshots, fPath);
-    } else {
-      Path parentDirPath = fPath.getParent();
-      if (acidTempDirFilter.accept(parentDirPath)) {
-        while (isChildOfDelta(parentDirPath, path)) {
-          // Some cases there are other directory layers between the delta and the datafiles
-          // (export-import mm table, insert with union all to mm table, skewed tables).
-          // But it does not matter for the AcidState, we just need the deltas and the data files
-          // So build the snapshot with the files inside the delta directory
-          parentDirPath = parentDirPath.getParent();
-        }
-        HdfsDirSnapshot dirSnapshot = addToSnapshot(dirToSnapshots, parentDirPath);
-        // We're not filtering out the metadata file and acid format file,
-        // as they represent parts of a valid snapshot
-        // We're not using the cached values downstream, but we can potentially optimize more in a follow-up task
-        if (fStatus.getPath().toString().contains(MetaDataFile.METADATA_FILE)) {
-          dirSnapshot.addMetadataFile(fStatus);
-        } else if (fStatus.getPath().toString().contains(OrcAcidVersion.ACID_FORMAT)) {
-          dirSnapshot.addOrcAcidFormatFile(fStatus);
-        } else {
-          dirSnapshot.addFile(fStatus);
+  Deque<RemoteIterator<LocatedFileStatus>> stack = new ArrayDeque<>();
Review Comment: @deniskuzZ I just followed the same approach implemented in the _getHdfsDirSnapshotsForCleaner()_ API. The main problem here is with the Hadoop API: @ayushtkn already raised an issue, [HADOOP-18662](https://issues.apache.org/jira/browse/HADOOP-18662), but the fix is not yet merged. If the Hadoop fix is merged, then there is no need for any fix on the Hive side.
Issue Time Tracking --- Worklog Id: (was: 851999) Time Spent: 2h 10m (was: 2h)
> Cleaner fails with FileNotFoundException
> Key: HIVE-27135
> URL: https://issues.apache.org/jira/browse/HIVE-27135
> Project: Hive
> Issue Type: Bug
> Reporter: Dayakar M
> Assignee: Dayakar M
> Priority: Major
> Labels: pull-request-available
> Time Spent: 2h 10m
> Remaining Estimate: 0h
>
> The compaction fails when the Cleaner tries to remove a missing directory from HDFS.
> {code:java}
> 2023-03-06 07:45:48,331 ERROR org.apache.hadoop.hive.ql.txn.compactor.Cleaner: [Cleaner-executor-thread-12]: Caught exception when cleaning, unable to complete cleaning of id:39762523,dbname:test,tableName:test_table,partName:null,state:,type:MINOR,enqueueTime:0,start:0,properties:null,runAs:hive,tooManyAborts:false,hasOldAbort:false,highestWriteId:989,errorMessage:null,workerId: null,initiatorId: null
> java.io.FileNotFoundException: File hdfs:/cluster/warehouse/tablespace/managed/hive/test.db/test_table/.hive-staging_hive_2023-03-06_07-45-23_120_4659605113266849995-73550 does not exist.
> at org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1275)
> at org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1249)
> at org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1194)
> at org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1190)
> at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at org.apache.hadoop.hdfs.DistributedFileSystem.listLocatedStatus(DistributedFileSystem.java:1208)
> at org.apache.hadoop.fs.FileSystem.listLocatedStatus(FileSystem.java:2144)
> at org.apache.hadoop.fs.FileSystem$5.handleFileStat(FileSystem.java:2332)
> at org.apache.hadoop.fs.FileSystem$5.hasNext(F
[jira] [Updated] (HIVE-22813) Hive query fails if table location is in remote EZ and it's readonly
[ https://issues.apache.org/jira/browse/HIVE-22813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] László Végh updated HIVE-22813: --- Fix Version/s: 4.0.0 Status: In Progress (was: Patch Available)
> Hive query fails if table location is in remote EZ and it's readonly
> Key: HIVE-22813
> URL: https://issues.apache.org/jira/browse/HIVE-22813
> Project: Hive
> Issue Type: Bug
> Reporter: Purshotam Shah
> Assignee: László Végh
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
> Attachments: HIVE-22813.patch
> Time Spent: 1h 20m
> Remaining Estimate: 0h
>
> {code}
> [purushah@gwrd352n21 ~]$ hive
> hive> select * from puru_db.page_view_ez;
> FAILED: SemanticException Unable to compare key strength for hdfs://nn1/<>/puru_db_ez/page_view_ez and hdfs://nn2:8020/tmp/puru/d558ac89-1359-424c-92ee-d0fefa8e6593/hive_2020-01-31_19-46-55_114_644945433042922-1/-mr-1 : java.lang.IllegalArgumentException: Wrong FS: hdfs://nn1:8020/<>/puru_db_ez/page_view_ez, expected: hdfs://nn2
> hive>
> {code}
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-22813) Hive query fails if table location is in remote EZ and it's readonly
[ https://issues.apache.org/jira/browse/HIVE-22813?focusedWorklogId=851997&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-851997 ] ASF GitHub Bot logged work on HIVE-22813: - Author: ASF GitHub Bot Created on: 21/Mar/23 12:30 Start Date: 21/Mar/23 12:30 Worklog Time Spent: 10m Work Description: veghlaci05 merged PR #4112: URL: https://github.com/apache/hive/pull/4112
Issue Time Tracking --- Worklog Id: (was: 851997) Time Spent: 1h 20m (was: 1h 10m)
> Hive query fails if table location is in remote EZ and it's readonly
> Key: HIVE-22813
> URL: https://issues.apache.org/jira/browse/HIVE-22813
> Project: Hive
> Issue Type: Bug
> Reporter: Purshotam Shah
> Assignee: Purshotam Shah
> Priority: Major
> Labels: pull-request-available
> Attachments: HIVE-22813.patch
> Time Spent: 1h 20m
> Remaining Estimate: 0h
>
> {code}
> [purushah@gwrd352n21 ~]$ hive
> hive> select * from puru_db.page_view_ez;
> FAILED: SemanticException Unable to compare key strength for hdfs://nn1/<>/puru_db_ez/page_view_ez and hdfs://nn2:8020/tmp/puru/d558ac89-1359-424c-92ee-d0fefa8e6593/hive_2020-01-31_19-46-55_114_644945433042922-1/-mr-1 : java.lang.IllegalArgumentException: Wrong FS: hdfs://nn1:8020/<>/puru_db_ez/page_view_ez, expected: hdfs://nn2
> hive>
> {code}
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-27147) HS2 is not accessible to clients via zookeeper when hostname used is not FQDN
[ https://issues.apache.org/jira/browse/HIVE-27147?focusedWorklogId=851992&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-851992 ] ASF GitHub Bot logged work on HIVE-27147: - Author: ASF GitHub Bot Created on: 21/Mar/23 12:18 Start Date: 21/Mar/23 12:18 Worklog Time Spent: 10m Work Description: sonarcloud[bot] commented on PR #4130: URL: https://github.com/apache/hive/pull/4130#issuecomment-1477741425 Kudos, [SonarCloud Quality Gate passed](https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=4130)! 0 Bugs, 0 Vulnerabilities, 0 Security Hotspots, 0 Code Smells; no coverage or duplication information.
Issue Time Tracking --- Worklog Id: (was: 851992) Time Spent: 50m (was: 40m)
> HS2 is not accessible to clients via zookeeper when hostname used is not FQDN
> Key: HIVE-27147
> URL: https://issues.apache.org/jira/browse/HIVE-27147
> Project: Hive
> Issue Type: Bug
> Reporter: Venugopal Reddy K
> Priority: Major
> Labels: pull-request-available
> Time Spent: 50m
> Remaining Estimate: 0h
>
> HS2 is not accessible to clients via zookeeper when the hostname used during registration is InetAddress.getHostName() with JDK 11. This issue happens due to a change in behavior on JDK 11 and it is OS specific - https://stackoverflow.com/questions/61898627/inetaddress-getlocalhost-gethostname-different-behavior-between-jdk-11-and-j
-- This message was sent by Atlassian Jira (v8.20.10#820010)
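The JDK 11 behavior change behind this bug is easy to observe with plain java.net calls. A small sketch follows; preferring the canonical name for registration is one plausible direction, not necessarily what the patch does:
{code:java}
import java.net.InetAddress;
import java.net.UnknownHostException;

// On JDK 11 some OSes return a short hostname from getHostName(), so a
// ZooKeeper-registered endpoint may not resolve for remote clients.
// getCanonicalHostName() asks the resolver for the FQDN instead.
public final class HostnameCheck {
  public static void main(String[] args) throws UnknownHostException {
    InetAddress local = InetAddress.getLocalHost();
    String shortName = local.getHostName();       // may be "myhost" on JDK 11
    String fqdn = local.getCanonicalHostName();   // e.g. "myhost.example.com"
    System.out.printf("short=%s fqdn=%s%n", shortName, fqdn);
    // Prefer the fully qualified name for service registration when the two differ.
    String registrationHost = fqdn.contains(".") ? fqdn : shortName;
    System.out.println("register as: " + registrationHost);
  }
}
{code}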
[jira] [Work logged] (HIVE-27020) Implement a separate handler to handle aborted transaction cleanup
[ https://issues.apache.org/jira/browse/HIVE-27020?focusedWorklogId=851986&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-851986 ] ASF GitHub Bot logged work on HIVE-27020: - Author: ASF GitHub Bot Created on: 21/Mar/23 11:54 Start Date: 21/Mar/23 11:54 Worklog Time Spent: 10m Work Description: SourabhBadhya commented on code in PR #4091: URL: https://github.com/apache/hive/pull/4091#discussion_r1143260643 ## ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/handler/AbortedTxnCleaner.java: ## @@ -0,0 +1,168 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.hadoop.hive.ql.txn.compactor.handler; + +import org.apache.hadoop.hive.common.ValidReaderWriteIdList; +import org.apache.hadoop.hive.common.ValidTxnList; +import org.apache.hadoop.hive.conf.HiveConf; +import org.apache.hadoop.hive.metastore.api.MetaException; +import org.apache.hadoop.hive.metastore.api.Partition; +import org.apache.hadoop.hive.metastore.api.Table; +import org.apache.hadoop.hive.metastore.metrics.MetricsConstants; +import org.apache.hadoop.hive.metastore.metrics.PerfLogger; +import org.apache.hadoop.hive.metastore.txn.AcidTxnInfo; +import org.apache.hadoop.hive.metastore.txn.TxnStore; +import org.apache.hadoop.hive.metastore.txn.TxnUtils; +import org.apache.hadoop.hive.metastore.utils.MetaStoreUtils; +import org.apache.hadoop.hive.ql.txn.compactor.CompactorUtil; +import org.apache.hadoop.hive.ql.txn.compactor.CompactorUtil.ThrowingRunnable; +import org.apache.hadoop.hive.ql.txn.compactor.FSRemover; +import org.apache.hadoop.hive.ql.txn.compactor.MetadataCache; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.Collections; +import java.util.List; +import java.util.concurrent.TimeUnit; +import java.util.stream.Collectors; + +import static java.util.Objects.isNull; + +/** + * Abort-cleanup based implementation of TaskHandler. + * Provides implementation of creation of abort clean tasks. + */ +class AbortedTxnCleaner extends AcidTxnCleaner { + + private static final Logger LOG = LoggerFactory.getLogger(AbortedTxnCleaner.class.getName()); + + public AbortedTxnCleaner(HiveConf conf, TxnStore txnHandler, + MetadataCache metadataCache, boolean metricsEnabled, + FSRemover fsRemover) { +super(conf, txnHandler, metadataCache, metricsEnabled, fsRemover); + } + + /** + The following cleanup is based on the following idea - + 1. Aborted cleanup is independent of compaction. This is because directories which are written by + aborted txns are not visible by any open txns. It is only visible while determining the AcidState (which + only sees the aborted deltas and does not read the file). + + The following algorithm is used to clean the set of aborted directories - + a. 
Find the list of entries which are suitable for cleanup (This is done in {@link TxnStore#findReadyToCleanForAborts(long, int)}).
+ b. If the table/partition does not exist, then remove the associated aborted entry in TXN_COMPONENTS table.
+ c. Get the AcidState of the table by using the min open txnID, database name, tableName, partition name, highest write ID
+ d. Fetch the aborted directories and delete the directories.
+ e. Fetch the aborted write IDs from the AcidState and use it to delete the associated metadata in the TXN_COMPONENTS table.
+ **/
+ @Override
+ public List<Runnable> getTasks() throws MetaException {
+   int abortedThreshold = HiveConf.getIntVar(conf,
+       HiveConf.ConfVars.HIVE_COMPACTOR_ABORTEDTXN_THRESHOLD);
+   long abortedTimeThreshold = HiveConf
+       .getTimeVar(conf, HiveConf.ConfVars.HIVE_COMPACTOR_ABORTEDTXN_TIME_THRESHOLD,
+           TimeUnit.MILLISECONDS);
+   List<AcidTxnInfo> readyToCleanAborts = txnHandler.findReadyToCleanForAborts(abortedTimeThreshold, abortedThreshold);
+
+   if (!readyToCleanAborts.isEmpty()) {
+     return readyToCleanAborts.stream().map(ci -> ThrowingRunnable.unchecked(() ->
+         clean(ci, ci.txnId > 0 ? ci.txnId : Long.MAX_
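To make the a-e algorithm above easier to scan, here is a compressed, self-contained sketch of one cleanup task. Every type and helper in it (Entry, AcidState, Backend, and their methods) is an invented placeholder that mirrors the shape of the steps, not one of Hive's real classes:
{code:java}
import java.util.List;

// Hypothetical outline of one aborted-txn cleanup task (steps b-e above).
public final class AbortCleanupSketch {
  record Entry(String db, String table, String partition,
               long minOpenTxnId, long highestWriteId) {}
  record AcidState(List<String> abortedDirectories, List<Long> abortedWriteIds) {}

  interface Backend {
    boolean tableOrPartitionExists(Entry e);
    AcidState getAcidState(Entry e);              // step c: bounded by minOpenTxnId / highestWriteId
    void deleteDirs(List<String> dirs);           // step d: remove aborted delta directories
    void purgeTxnComponents(Entry e, List<Long> abortedWriteIds); // steps b/e: metadata cleanup
  }

  static void cleanOneEntry(Backend backend, Entry e) {
    if (!backend.tableOrPartitionExists(e)) {     // step b: table/partition already dropped
      backend.purgeTxnComponents(e, List.of());
      return;
    }
    AcidState state = backend.getAcidState(e);            // step c
    backend.deleteDirs(state.abortedDirectories());       // step d
    backend.purgeTxnComponents(e, state.abortedWriteIds()); // step e
  }
}
{code}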
[jira] [Work logged] (HIVE-27020) Implement a separate handler to handle aborted transaction cleanup
[ https://issues.apache.org/jira/browse/HIVE-27020?focusedWorklogId=851984&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-851984 ] ASF GitHub Bot logged work on HIVE-27020: - Author: ASF GitHub Bot Created on: 21/Mar/23 11:39 Start Date: 21/Mar/23 11:39 Worklog Time Spent: 10m Work Description: akshat0395 commented on code in PR #4091: URL: https://github.com/apache/hive/pull/4091#discussion_r1143245882 ## ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/handler/AbortedTxnCleaner.java: ## @@ -0,0 +1,168 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.hadoop.hive.ql.txn.compactor.handler; + +import org.apache.hadoop.hive.common.ValidReaderWriteIdList; +import org.apache.hadoop.hive.common.ValidTxnList; +import org.apache.hadoop.hive.conf.HiveConf; +import org.apache.hadoop.hive.metastore.api.MetaException; +import org.apache.hadoop.hive.metastore.api.Partition; +import org.apache.hadoop.hive.metastore.api.Table; +import org.apache.hadoop.hive.metastore.metrics.MetricsConstants; +import org.apache.hadoop.hive.metastore.metrics.PerfLogger; +import org.apache.hadoop.hive.metastore.txn.AcidTxnInfo; +import org.apache.hadoop.hive.metastore.txn.TxnStore; +import org.apache.hadoop.hive.metastore.txn.TxnUtils; +import org.apache.hadoop.hive.metastore.utils.MetaStoreUtils; +import org.apache.hadoop.hive.ql.txn.compactor.CompactorUtil; +import org.apache.hadoop.hive.ql.txn.compactor.CompactorUtil.ThrowingRunnable; +import org.apache.hadoop.hive.ql.txn.compactor.FSRemover; +import org.apache.hadoop.hive.ql.txn.compactor.MetadataCache; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.Collections; +import java.util.List; +import java.util.concurrent.TimeUnit; +import java.util.stream.Collectors; + +import static java.util.Objects.isNull; + +/** + * Abort-cleanup based implementation of TaskHandler. + * Provides implementation of creation of abort clean tasks. + */ +class AbortedTxnCleaner extends AcidTxnCleaner { + + private static final Logger LOG = LoggerFactory.getLogger(AbortedTxnCleaner.class.getName()); + + public AbortedTxnCleaner(HiveConf conf, TxnStore txnHandler, + MetadataCache metadataCache, boolean metricsEnabled, + FSRemover fsRemover) { +super(conf, txnHandler, metadataCache, metricsEnabled, fsRemover); + } + + /** + The following cleanup is based on the following idea - + 1. Aborted cleanup is independent of compaction. This is because directories which are written by + aborted txns are not visible by any open txns. It is only visible while determining the AcidState (which + only sees the aborted deltas and does not read the file). + + The following algorithm is used to clean the set of aborted directories - + a. 
Find the list of entries which are suitable for cleanup (This is done in {@link TxnStore#findReadyToCleanForAborts(long, int)}).
+ b. If the table/partition does not exist, then remove the associated aborted entry in TXN_COMPONENTS table.
+ c. Get the AcidState of the table by using the min open txnID, database name, tableName, partition name, highest write ID
+ d. Fetch the aborted directories and delete the directories.
+ e. Fetch the aborted write IDs from the AcidState and use it to delete the associated metadata in the TXN_COMPONENTS table.
+ **/
+ @Override
+ public List<Runnable> getTasks() throws MetaException {
+   int abortedThreshold = HiveConf.getIntVar(conf,
+       HiveConf.ConfVars.HIVE_COMPACTOR_ABORTEDTXN_THRESHOLD);
+   long abortedTimeThreshold = HiveConf
+       .getTimeVar(conf, HiveConf.ConfVars.HIVE_COMPACTOR_ABORTEDTXN_TIME_THRESHOLD,
+           TimeUnit.MILLISECONDS);
+   List<AcidTxnInfo> readyToCleanAborts = txnHandler.findReadyToCleanForAborts(abortedTimeThreshold, abortedThreshold);
+
+   if (!readyToCleanAborts.isEmpty()) {
+     return readyToCleanAborts.stream().map(ci -> ThrowingRunnable.unchecked(() ->
+         clean(ci, ci.txnId > 0 ? ci.txnId : Long.MAX_VAL
[jira] [Work logged] (HIVE-27097) Improve the retry strategy for Metastore client and server
[ https://issues.apache.org/jira/browse/HIVE-27097?focusedWorklogId=851975&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-851975 ] ASF GitHub Bot logged work on HIVE-27097: - Author: ASF GitHub Bot Created on: 21/Mar/23 10:28 Start Date: 21/Mar/23 10:28 Worklog Time Spent: 10m Work Description: sonarcloud[bot] commented on PR #4076: URL: https://github.com/apache/hive/pull/4076#issuecomment-1477594791 Kudos, [SonarCloud Quality Gate passed](https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=4076)! 0 Bugs, 0 Vulnerabilities, 0 Security Hotspots, 1 Code Smell; no coverage or duplication information.
Issue Time Tracking --- Worklog Id: (was: 851975) Time Spent: 4h 40m (was: 4.5h)
> Improve the retry strategy for Metastore client and server
> Key: HIVE-27097
> URL: https://issues.apache.org/jira/browse/HIVE-27097
> Project: Hive
> Issue Type: Improvement
> Components: Hive
> Affects Versions: 4.0.0-alpha-2
> Reporter: Wechar
> Assignee: Wechar
> Priority: Major
> Labels: pull-request-available
> Time Spent: 4h 40m
> Remaining Estimate: 0h
>
> *Background*
> Hive provides *{{RetryingMetaStoreClient}}* and *{{RetryingHMSHandler}}* to retry when a thrift request fails:
> * RetryingMetaStoreClient will retry for *thrift related exceptions* and some *MetaException*s
> * RetryingHMSHandler will retry for all {*}JDOException{*}s or *NucleusException*s.
> *Motivation*
> The current retry mechanism leads to many unnecessary retries in both client and server. To simplify the process, we introduce the following retry mechanism:
> * Client side only concerns the error of communica
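A generic shape for the client-side policy sketched in the description is to retry only on transport-level failures and fail fast on server-side logic errors. The following is an illustrative pattern, not the HIVE-27097 patch itself:
{code:java}
import java.io.IOException;
import java.util.concurrent.Callable;
import java.util.function.Predicate;

public final class RetryingCall {
  // Retry a call when the failure is transient per the caller's predicate,
  // rethrow immediately otherwise (e.g. a server-side logic error).
  public static <T> T callWithRetry(Callable<T> call, Predicate<Exception> retriable,
      int maxAttempts, long backoffMillis) throws Exception {
    for (int attempt = 1; ; attempt++) {
      try {
        return call.call();
      } catch (Exception e) {
        if (!retriable.test(e) || attempt >= maxAttempts) {
          throw e; // non-retriable failure, or attempts exhausted
        }
        Thread.sleep(backoffMillis * attempt); // linear backoff between attempts
      }
    }
  }

  public static void main(String[] args) throws Exception {
    // Stand-in for a metastore thrift call; only I/O failures are treated as transient.
    String result = callWithRetry(() -> "ok", e -> e instanceof IOException, 3, 100L);
    System.out.println(result);
  }
}
{code}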
[jira] [Resolved] (HIVE-26091) Support DecimalFilterPredicateLeafBuilder for parquet
[ https://issues.apache.org/jira/browse/HIVE-26091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Balamohan resolved HIVE-26091. - Resolution: Duplicate Closing this as a dup of HIVE-27159. (HIVE-27159 has more info.)
> Support DecimalFilterPredicateLeafBuilder for parquet
> Key: HIVE-26091
> URL: https://issues.apache.org/jira/browse/HIVE-26091
> Project: Hive
> Issue Type: Bug
> Reporter: Rajesh Balamohan
> Priority: Major
>
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/LeafFilterFactory.java#L41
> It would be nice to have a DecimalFilterPredicateLeafBuilder. This will help in supporting SARG pushdowns.
> {noformat}
> 2022-03-30 08:59:50,040 [ERROR] [TezChild] |read.ParquetFilterPredicateConverter|: fail to build predicate filter leaf with errors
> org.apache.hadoop.hive.ql.metadata.HiveException: Conversion to Parquet FilterPredicate not supported for DECIMAL
> at org.apache.hadoop.hive.ql.io.parquet.LeafFilterFactory.getLeafFilterBuilderByType(LeafFilterFactory.java:223)
> at org.apache.hadoop.hive.ql.io.parquet.read.ParquetFilterPredicateConverter.buildFilterPredicateFromPredicateLeaf(ParquetFilterPredicateConverter.java:130)
> at org.apache.hadoop.hive.ql.io.parquet.read.ParquetFilterPredicateConverter.translate(ParquetFilterPredicateConverter.java:111)
> at org.apache.hadoop.hive.ql.io.parquet.read.ParquetFilterPredicateConverter.translate(ParquetFilterPredicateConverter.java:97)
> at org.apache.hadoop.hive.ql.io.parquet.read.ParquetFilterPredicateConverter.translate(ParquetFilterPredicateConverter.java:71)
> at org.apache.hadoop.hive.ql.io.parquet.read.ParquetFilterPredicateConverter.translate(ParquetFilterPredicateConverter.java:88)
> at org.apache.hadoop.hive.ql.io.parquet.read.ParquetFilterPredicateConverter.toFilterPredicate(ParquetFilterPredicateConverter.java:57)
> at org.apache.hadoop.hive.ql.io.parquet.ParquetRecordReaderBase.setFilter(ParquetRecordReaderBase.java:184)
> at org.apache.hadoop.hive.ql.io.parquet.ParquetRecordReaderBase.getSplit(ParquetRecordReaderBase.java:124)
> at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.<init>(VectorizedParquetRecordReader.java:158)
> at org.apache.hadoop.hive.ql.io.parquet.VectorizedParquetInputFormat.getRecordReader(VectorizedParquetInputFormat.java:50)
> at org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:87)
> at org.apache.hadoop.hive.ql.io.RecordReaderWrapper.create(RecordReaderWrapper.java:72)
> at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:429)
> at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:203)
> at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:152)
> at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116)
> at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
> at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:437)
> at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:282)
> at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:265)
> at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
> at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:75)
> at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:62)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898)
> at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:62)
> at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:38)
> at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
> at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
> at com.google.common.util.concurrent.InterruptibleTask.run(
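For the missing builder itself, the core task is turning a decimal literal into the binary representation Parquet compares against. Below is a hedged sketch using parquet-mr's public FilterApi; the fixed-length sign-extension scheme is an assumption for illustration, and HIVE-27159 should be consulted for the exact semantics that eventually landed:
{code:java}
import java.math.BigDecimal;
import java.util.Arrays;

import org.apache.parquet.filter2.predicate.FilterApi;
import org.apache.parquet.filter2.predicate.FilterPredicate;
import org.apache.parquet.io.api.Binary;

// Hedged sketch: build a Parquet predicate for a DECIMAL column stored as
// FIXED_LEN_BYTE_ARRAY. Illustrative only, not the code that fixed HIVE-27159.
public final class DecimalPredicateSketch {
  /** Sign-extends the unscaled two's-complement bytes of {@code value} to {@code typeLength} bytes. */
  static Binary decimalToBinary(BigDecimal value, int scale, int typeLength) {
    byte[] unscaled = value.setScale(scale).unscaledValue().toByteArray();
    byte[] padded = new byte[typeLength];
    byte pad = (byte) (unscaled[0] < 0 ? 0xFF : 0x00); // replicate the sign byte
    Arrays.fill(padded, 0, typeLength - unscaled.length, pad);
    System.arraycopy(unscaled, 0, padded, typeLength - unscaled.length, unscaled.length);
    return Binary.fromConstantByteArray(padded);
  }

  static FilterPredicate decimalEquals(String column, BigDecimal literal, int scale, int typeLength) {
    return FilterApi.eq(FilterApi.binaryColumn(column), decimalToBinary(literal, scale, typeLength));
  }
}
{code}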
[jira] [Work logged] (HIVE-26400) Provide docker images for Hive
[ https://issues.apache.org/jira/browse/HIVE-26400?focusedWorklogId=851964&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-851964 ] ASF GitHub Bot logged work on HIVE-26400: - Author: ASF GitHub Bot Created on: 21/Mar/23 09:12 Start Date: 21/Mar/23 09:12 Worklog Time Spent: 10m Work Description: sonarcloud[bot] commented on PR #4133: URL: https://github.com/apache/hive/pull/4133#issuecomment-1477493508 Kudos, [SonarCloud Quality Gate passed](https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=4133)! 0 Bugs, 0 Vulnerabilities, 0 Security Hotspots, 0 Code Smells; no coverage or duplication information.
Issue Time Tracking --- Worklog Id: (was: 851964) Time Spent: 8.5h (was: 8h 20m)
> Provide docker images for Hive
> Key: HIVE-26400
> URL: https://issues.apache.org/jira/browse/HIVE-26400
> Project: Hive
> Issue Type: Sub-task
> Components: Build Infrastructure
> Reporter: Zhihua Deng
> Assignee: Zhihua Deng
> Priority: Blocker
> Labels: hive-4.0.0-must, pull-request-available
> Time Spent: 8.5h
> Remaining Estimate: 0h
>
> Make Apache Hive able to run inside a docker container in pseudo-distributed mode, with MySQL/Derby as its backing database, providing the following:
> * Quick-start/Debugging/Prepare a test env for Hive;
> * Tools to build the target image with a specified version of Hive and its dependencies;
> * Images can be used as the basis for the Kubernetes operator.
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (HIVE-27160) Iceberg: Optimise delete (entire) data from table
[ https://issues.apache.org/jira/browse/HIVE-27160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Denys Kuzmenko resolved HIVE-27160. --- Fix Version/s: 4.0.0 Resolution: Fixed > Iceberg: Optimise delete (entire) data from table > - > > Key: HIVE-27160 > URL: https://issues.apache.org/jira/browse/HIVE-27160 > Project: Hive > Issue Type: Task >Reporter: Denys Kuzmenko >Assignee: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 10m > Remaining Estimate: 0h > > Currently, in MOR mode, Hive creates "positional delete" files during > deletes. With "Delete from ", the entire dataset in the table or partition is > written as a "positional delete" file. > During the read operation, all these files are read again causing huge delay. > Proposal: apply "truncate" optimization in case of "delete *". -- This message was sent by Atlassian Jira (v8.20.10#820010)
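The proposal amounts to routing an unfiltered DELETE to a metadata-only truncate instead of the merge-on-read write path. A hypothetical sketch of that routing decision follows; all names are invented and this is not Hive's planner code:
{code:java}
// Hypothetical sketch of the optimization idea: an unfiltered DELETE on an
// Iceberg table in merge-on-read mode is rewritten as a truncate, instead of
// writing a positional-delete file that covers every row.
public final class DeleteRewriteSketch {
  interface TableOps {
    void truncate();                 // metadata-only: drop all data files at once
    void writePositionalDeletes();   // MOR path: mark individual rows deleted
  }

  static void executeDelete(TableOps table, boolean hasWhereClause) {
    if (!hasWhereClause) {
      table.truncate();              // "DELETE FROM t" removes everything anyway
    } else {
      table.writePositionalDeletes(); // selective delete keeps MOR semantics
    }
  }
}
{code}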
[jira] [Commented] (HIVE-27160) Iceberg: Optimise delete (entire) data from table
[ https://issues.apache.org/jira/browse/HIVE-27160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17703111#comment-17703111 ] Denys Kuzmenko commented on HIVE-27160: --- Merged to master. Thanks [~kkasa] for the review! > Iceberg: Optimise delete (entire) data from table > - > > Key: HIVE-27160 > URL: https://issues.apache.org/jira/browse/HIVE-27160 > Project: Hive > Issue Type: Task >Reporter: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Currently, in MOR mode, Hive creates "positional delete" files during > deletes. With "Delete from ", the entire dataset in the table or partition is > written as a "positional delete" file. > During the read operation, all these files are read again causing huge delay. > Proposal: apply "truncate" optimization in case of "delete *". -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (HIVE-27160) Iceberg: Optimise delete (entire) data from table
[ https://issues.apache.org/jira/browse/HIVE-27160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Denys Kuzmenko reassigned HIVE-27160: - Assignee: Denys Kuzmenko > Iceberg: Optimise delete (entire) data from table > - > > Key: HIVE-27160 > URL: https://issues.apache.org/jira/browse/HIVE-27160 > Project: Hive > Issue Type: Task >Reporter: Denys Kuzmenko >Assignee: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Currently, in MOR mode, Hive creates "positional delete" files during > deletes. With "Delete from ", the entire dataset in the table or partition is > written as a "positional delete" file. > During the read operation, all these files are read again causing huge delay. > Proposal: apply "truncate" optimization in case of "delete *". -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-27160) Iceberg: Optimise delete (entire) data from table
[ https://issues.apache.org/jira/browse/HIVE-27160?focusedWorklogId=851961&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-851961 ] ASF GitHub Bot logged work on HIVE-27160: - Author: ASF GitHub Bot Created on: 21/Mar/23 09:01 Start Date: 21/Mar/23 09:01 Worklog Time Spent: 10m Work Description: deniskuzZ merged PR #4109: URL: https://github.com/apache/hive/pull/4109 Issue Time Tracking --- Worklog Id: (was: 851961) Remaining Estimate: 0h Time Spent: 10m > Iceberg: Optimise delete (entire) data from table > - > > Key: HIVE-27160 > URL: https://issues.apache.org/jira/browse/HIVE-27160 > Project: Hive > Issue Type: Task >Reporter: Denys Kuzmenko >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > Currently, in MOR mode, Hive creates "positional delete" files during > deletes. With "Delete from ", the entire dataset in the table or partition is > written as a "positional delete" file. > During the read operation, all these files are read again causing huge delay. > Proposal: apply "truncate" optimization in case of "delete *". -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HIVE-27160) Iceberg: Optimise delete (entire) data from table
[ https://issues.apache.org/jira/browse/HIVE-27160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-27160: -- Labels: pull-request-available (was: ) > Iceberg: Optimise delete (entire) data from table > - > > Key: HIVE-27160 > URL: https://issues.apache.org/jira/browse/HIVE-27160 > Project: Hive > Issue Type: Task >Reporter: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Currently, in MOR mode, Hive creates "positional delete" files during > deletes. With "Delete from ", the entire dataset in the table or partition is > written as a "positional delete" file. > During the read operation, all these files are read again causing huge delay. > Proposal: apply "truncate" optimization in case of "delete *". -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-27020) Implement a separate handler to handle aborted transaction cleanup
[ https://issues.apache.org/jira/browse/HIVE-27020?focusedWorklogId=851952&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-851952 ] ASF GitHub Bot logged work on HIVE-27020: - Author: ASF GitHub Bot Created on: 21/Mar/23 08:26 Start Date: 21/Mar/23 08:26 Worklog Time Spent: 10m Work Description: veghlaci05 commented on code in PR #4091: URL: https://github.com/apache/hive/pull/4091#discussion_r1143022684 ## ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/handler/AbortedTxnCleaner.java: ## @@ -0,0 +1,168 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.hadoop.hive.ql.txn.compactor.handler; + +import org.apache.hadoop.hive.common.ValidReaderWriteIdList; +import org.apache.hadoop.hive.common.ValidTxnList; +import org.apache.hadoop.hive.conf.HiveConf; +import org.apache.hadoop.hive.metastore.api.MetaException; +import org.apache.hadoop.hive.metastore.api.Partition; +import org.apache.hadoop.hive.metastore.api.Table; +import org.apache.hadoop.hive.metastore.metrics.MetricsConstants; +import org.apache.hadoop.hive.metastore.metrics.PerfLogger; +import org.apache.hadoop.hive.metastore.txn.AcidTxnInfo; +import org.apache.hadoop.hive.metastore.txn.TxnStore; +import org.apache.hadoop.hive.metastore.txn.TxnUtils; +import org.apache.hadoop.hive.metastore.utils.MetaStoreUtils; +import org.apache.hadoop.hive.ql.txn.compactor.CompactorUtil; +import org.apache.hadoop.hive.ql.txn.compactor.CompactorUtil.ThrowingRunnable; +import org.apache.hadoop.hive.ql.txn.compactor.FSRemover; +import org.apache.hadoop.hive.ql.txn.compactor.MetadataCache; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.Collections; +import java.util.List; +import java.util.concurrent.TimeUnit; +import java.util.stream.Collectors; + +import static java.util.Objects.isNull; + +/** + * Abort-cleanup based implementation of TaskHandler. + * Provides implementation of creation of abort clean tasks. + */ +class AbortedTxnCleaner extends AcidTxnCleaner { + + private static final Logger LOG = LoggerFactory.getLogger(AbortedTxnCleaner.class.getName()); + + public AbortedTxnCleaner(HiveConf conf, TxnStore txnHandler, + MetadataCache metadataCache, boolean metricsEnabled, + FSRemover fsRemover) { +super(conf, txnHandler, metadataCache, metricsEnabled, fsRemover); + } + + /** + The following cleanup is based on the following idea - + 1. Aborted cleanup is independent of compaction. This is because directories which are written by + aborted txns are not visible by any open txns. It is only visible while determining the AcidState (which + only sees the aborted deltas and does not read the file). + + The following algorithm is used to clean the set of aborted directories - + a. 
Find the list of entries which are suitable for cleanup (This is done in {@link TxnStore#findReadyToCleanForAborts(long, int)}).
+ b. If the table/partition does not exist, then remove the associated aborted entry in TXN_COMPONENTS table.
+ c. Get the AcidState of the table by using the min open txnID, database name, tableName, partition name, highest write ID
+ d. Fetch the aborted directories and delete the directories.
+ e. Fetch the aborted write IDs from the AcidState and use it to delete the associated metadata in the TXN_COMPONENTS table.
+ **/
+ @Override
+ public List<Runnable> getTasks() throws MetaException {
+   int abortedThreshold = HiveConf.getIntVar(conf,
+       HiveConf.ConfVars.HIVE_COMPACTOR_ABORTEDTXN_THRESHOLD);
+   long abortedTimeThreshold = HiveConf
+       .getTimeVar(conf, HiveConf.ConfVars.HIVE_COMPACTOR_ABORTEDTXN_TIME_THRESHOLD,
+           TimeUnit.MILLISECONDS);
+   List<AcidTxnInfo> readyToCleanAborts = txnHandler.findReadyToCleanForAborts(abortedTimeThreshold, abortedThreshold);
+
+   if (!readyToCleanAborts.isEmpty()) {
+     return readyToCleanAborts.stream().map(ci -> ThrowingRunnable.unchecked(() ->
+         clean(ci, ci.txnId > 0 ? ci.txnId : Long.MAX_VAL
[jira] [Work logged] (HIVE-27020) Implement a separate handler to handle aborted transaction cleanup
[ https://issues.apache.org/jira/browse/HIVE-27020?focusedWorklogId=851951&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-851951 ] ASF GitHub Bot logged work on HIVE-27020: - Author: ASF GitHub Bot Created on: 21/Mar/23 08:26 Start Date: 21/Mar/23 08:26 Worklog Time Spent: 10m Work Description: veghlaci05 commented on code in PR #4091: URL: https://github.com/apache/hive/pull/4091#discussion_r1143022684 ## ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/handler/AbortedTxnCleaner.java: ## @@ -0,0 +1,168 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.hadoop.hive.ql.txn.compactor.handler; + +import org.apache.hadoop.hive.common.ValidReaderWriteIdList; +import org.apache.hadoop.hive.common.ValidTxnList; +import org.apache.hadoop.hive.conf.HiveConf; +import org.apache.hadoop.hive.metastore.api.MetaException; +import org.apache.hadoop.hive.metastore.api.Partition; +import org.apache.hadoop.hive.metastore.api.Table; +import org.apache.hadoop.hive.metastore.metrics.MetricsConstants; +import org.apache.hadoop.hive.metastore.metrics.PerfLogger; +import org.apache.hadoop.hive.metastore.txn.AcidTxnInfo; +import org.apache.hadoop.hive.metastore.txn.TxnStore; +import org.apache.hadoop.hive.metastore.txn.TxnUtils; +import org.apache.hadoop.hive.metastore.utils.MetaStoreUtils; +import org.apache.hadoop.hive.ql.txn.compactor.CompactorUtil; +import org.apache.hadoop.hive.ql.txn.compactor.CompactorUtil.ThrowingRunnable; +import org.apache.hadoop.hive.ql.txn.compactor.FSRemover; +import org.apache.hadoop.hive.ql.txn.compactor.MetadataCache; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.Collections; +import java.util.List; +import java.util.concurrent.TimeUnit; +import java.util.stream.Collectors; + +import static java.util.Objects.isNull; + +/** + * Abort-cleanup based implementation of TaskHandler. + * Provides implementation of creation of abort clean tasks. + */ +class AbortedTxnCleaner extends AcidTxnCleaner { + + private static final Logger LOG = LoggerFactory.getLogger(AbortedTxnCleaner.class.getName()); + + public AbortedTxnCleaner(HiveConf conf, TxnStore txnHandler, + MetadataCache metadataCache, boolean metricsEnabled, + FSRemover fsRemover) { +super(conf, txnHandler, metadataCache, metricsEnabled, fsRemover); + } + + /** + The following cleanup is based on the following idea - + 1. Aborted cleanup is independent of compaction. This is because directories which are written by + aborted txns are not visible by any open txns. It is only visible while determining the AcidState (which + only sees the aborted deltas and does not read the file). + + The following algorithm is used to clean the set of aborted directories - + a. 
Find the list of entries which are suitable for cleanup (This is done in {@link TxnStore#findReadyToCleanForAborts(long, int)}).
+ b. If the table/partition does not exist, then remove the associated aborted entry in TXN_COMPONENTS table.
+ c. Get the AcidState of the table by using the min open txnID, database name, tableName, partition name, highest write ID
+ d. Fetch the aborted directories and delete the directories.
+ e. Fetch the aborted write IDs from the AcidState and use it to delete the associated metadata in the TXN_COMPONENTS table.
+ **/
+ @Override
+ public List<Runnable> getTasks() throws MetaException {
+   int abortedThreshold = HiveConf.getIntVar(conf,
+       HiveConf.ConfVars.HIVE_COMPACTOR_ABORTEDTXN_THRESHOLD);
+   long abortedTimeThreshold = HiveConf
+       .getTimeVar(conf, HiveConf.ConfVars.HIVE_COMPACTOR_ABORTEDTXN_TIME_THRESHOLD,
+           TimeUnit.MILLISECONDS);
+   List<AcidTxnInfo> readyToCleanAborts = txnHandler.findReadyToCleanForAborts(abortedTimeThreshold, abortedThreshold);
+
+   if (!readyToCleanAborts.isEmpty()) {
+     return readyToCleanAborts.stream().map(ci -> ThrowingRunnable.unchecked(() ->
+         clean(ci, ci.txnId > 0 ? ci.txnId : Long.MAX_VAL
[jira] [Commented] (HIVE-27159) Filters are not pushed down for decimal format in Parquet
[ https://issues.apache.org/jira/browse/HIVE-27159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17703090#comment-17703090 ] Stamatis Zampetakis commented on HIVE-27159: This looks like a duplicate of HIVE-26091; let's close one of them.
> Filters are not pushed down for decimal format in Parquet
> Key: HIVE-27159
> URL: https://issues.apache.org/jira/browse/HIVE-27159
> Project: Hive
> Issue Type: Improvement
> Reporter: Rajesh Balamohan
> Priority: Major
> Labels: performance
>
> Decimal filters are not created and pushed down in parquet readers. This causes latency delays and unwanted row processing in query execution. It throws an exception at runtime and processes more rows.
> E.g. Q13.
> {noformat}
> Parquet: (Map 1)
> INFO : Task Execution Summary
> INFO : --------------------------------------------------------------------------------
> INFO : VERTICES    DURATION(ms)  CPU_TIME(ms)  GC_TIME(ms)  INPUT_RECORDS  OUTPUT_RECORDS
> INFO : --------------------------------------------------------------------------------
> INFO : Map 1           31254.00             0            0    549,181,950             133
> INFO : Map 3               0.00             0            0         73,049             365
> INFO : Map 4            2027.00             0            0      6,000,000       1,689,919
> INFO : Map 5               0.00             0            0          7,200           1,440
> INFO : Map 6             517.00             0            0      1,920,800         493,920
> INFO : Map 7               0.00             0            0          1,002           1,002
> INFO : Reducer 2       18716.00             0            0            133               0
> INFO : --------------------------------------------------------------------------------
> ORC:
> INFO : Task Execution Summary
> INFO : --------------------------------------------------------------------------------
> INFO : VERTICES    DURATION(ms)  CPU_TIME(ms)  GC_TIME(ms)  INPUT_RECORDS  OUTPUT_RECORDS
> INFO : --------------------------------------------------------------------------------
> INFO : Map 1            6556.00             0            0    267,146,063             152
> INFO : Map 3               0.00             0            0         10,000             365
> INFO : Map 4            2014.00             0            0      6,000,000       1,689,919
> INFO : Map 5               0.00             0            0          7,200           1,440
> INFO : Map 6             504.00             0            0      1,920,800         493,920
> INFO : Reducer 2        3159.00             0            0            152               0
> INFO : --------------------------------------------------------------------------------
> {noformat}
> {noformat}
> Map 1
>     Map Operator Tree:
>         TableScan
>           alias: store_sales
>           filterExpr: (ss_hdemo_sk is not null and ss_addr_sk is not null and ss_cdemo_sk is not null and ss_store_sk is not null and ((ss_sales_price >= 100) or (ss_sales_price <= 150) or (ss_sales_price >= 50) or (ss_sales_price <= 100) or (ss_sales_price >= 150) or (ss_sales_price <= 200)) and ((ss_net_profit >= 100) or (ss_net_profit <= 200) or (ss_net_profit >= 150) or (ss_net_profit <= 300) or (ss_net_profit >= 50) or (ss_net_profit <= 250))) (type: boolean)
>           probeDecodeDetails: cacheKey:HASH_MAP_MAPJOIN_112_container, bigKeyColName:ss_hdemo_sk, smallTablePos:1, keyRatio:5.042575832290721E-6
>           Statistics: Num rows: 2750380056 Data size: 1321831086472 Basic stats: COMPLETE Column stats: COMPLETE
>           Filter Operator
>             predicate: (ss_hdemo_sk is not null and ss_addr_sk is not null and ss_cdemo_sk is not null and ss_store_sk is not null and ((ss_sales_price >= 100) or (ss_sales_price <= 150) or (ss_sales_price >= 50) or (ss_sales_price <= 100) or (ss_sales_price >= 150) or (ss_sales_price <= 200)) and ((ss_net_profit >= 100) or (ss_net_profit <= 200) or (ss_net_profit >= 150) or (ss_net_profit <= 300) or (ss_net_profit >= 50) or (ss_net_profit <= 250))) (type: boolean)
>             Statistics: Num rows: 2500252205 Data size: 1201619783884 Basic stats: COMPLETE Column stats: COMPLETE
>             Select Operator
>               expressions: ss_cdemo_sk (type: bigint), ss_hdemo_sk (type: bigint), ss_a
[jira] [Work logged] (HIVE-26400) Provide docker images for Hive
[ https://issues.apache.org/jira/browse/HIVE-26400?focusedWorklogId=851949&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-851949 ] ASF GitHub Bot logged work on HIVE-26400: - Author: ASF GitHub Bot Created on: 21/Mar/23 08:15 Start Date: 21/Mar/23 08:15 Worklog Time Spent: 10m Work Description: dengzhhu653 opened a new pull request, #4133: URL: https://github.com/apache/hive/pull/4133
### What changes were proposed in this pull request?
The original PR #3448 is blocked by long-running tests, so this opens a new one.
### Why are the changes needed?
### Does this PR introduce _any_ user-facing change?
### How was this patch tested?
Issue Time Tracking --- Worklog Id: (was: 851949) Time Spent: 8h 20m (was: 8h 10m)
> Provide docker images for Hive
> Key: HIVE-26400
> URL: https://issues.apache.org/jira/browse/HIVE-26400
> Project: Hive
> Issue Type: Sub-task
> Components: Build Infrastructure
> Reporter: Zhihua Deng
> Assignee: Zhihua Deng
> Priority: Blocker
> Labels: hive-4.0.0-must, pull-request-available
> Time Spent: 8h 20m
> Remaining Estimate: 0h
>
> Make Apache Hive able to run inside a docker container in pseudo-distributed mode, with MySQL/Derby as its backing database, providing the following:
> * Quick-start/Debugging/Prepare a test env for Hive;
> * Tools to build the target image with a specified version of Hive and its dependencies;
> * Images can be used as the basis for the Kubernetes operator.
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-27147) HS2 is not accessible to clients via zookeeper when hostname used is not FQDN
[ https://issues.apache.org/jira/browse/HIVE-27147?focusedWorklogId=851948&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-851948 ] ASF GitHub Bot logged work on HIVE-27147: - Author: ASF GitHub Bot Created on: 21/Mar/23 08:05 Start Date: 21/Mar/23 08:05 Worklog Time Spent: 10m Work Description: sonarcloud[bot] commented on PR #4130: URL: https://github.com/apache/hive/pull/4130#issuecomment-1477414324 Kudos, SonarCloud Quality Gate passed! 0 Bugs, 0 Vulnerabilities, 0 Security Hotspots, 0 Code Smells; No Coverage information, No Duplication information (https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=4130) Issue Time Tracking --- Worklog Id: (was: 851948) Time Spent: 40m (was: 0.5h) > HS2 is not accessible to clients via zookeeper when hostname used is not FQDN > - > > Key: HIVE-27147 > URL: https://issues.apache.org/jira/browse/HIVE-27147 > Project: Hive > Issue Type: Bug >Reporter: Venugopal Reddy K >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > HS2 is not accessible to clients via zookeeper when the hostname used during > registration is InetAddress.getHostName() with JDK 11. The issue is caused by a > change in behavior on JDK 11 and is OS-specific: > https://stackoverflow.com/questions/61898627/inetaddress-getlocalhost-gethostname-different-behavior-between-jdk-11-and-j -- This message was sent by Atlassian Jira (v8.20.10#820010)
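For reference, a minimal standalone probe (not Hive code; class name is illustrative) showing the JDK difference the report describes. On an affected JDK 11/OS combination, getHostName() can return the short hostname while getCanonicalHostName() still resolves the FQDN that remote clients need to reach the ZooKeeper-registered HS2 instance:

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical probe, not part of Hive: prints both hostname forms so the
// JDK 8 vs JDK 11 behavior difference from HIVE-27147 can be observed directly.
public class HostnameProbe {
    public static void main(String[] args) throws UnknownHostException {
        InetAddress local = InetAddress.getLocalHost();
        // On JDK 11 this may be the short name (e.g. "myhost"), depending on the OS resolver.
        System.out.println("getHostName():          " + local.getHostName());
        // Asks the resolver for the fully qualified name (e.g. "myhost.example.com").
        System.out.println("getCanonicalHostName(): " + local.getCanonicalHostName());
    }
}
```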
[jira] [Work logged] (HIVE-26400) Provide docker images for Hive
[ https://issues.apache.org/jira/browse/HIVE-26400?focusedWorklogId=851947&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-851947 ] ASF GitHub Bot logged work on HIVE-26400: - Author: ASF GitHub Bot Created on: 21/Mar/23 08:00 Start Date: 21/Mar/23 08:00 Worklog Time Spent: 10m Work Description: dengzhhu653 commented on code in PR #3448: URL: https://github.com/apache/hive/pull/3448#discussion_r1142999778 ## packaging/src/docker/Dockerfile: ## @@ -0,0 +1,54 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# + +FROM openjdk:8-jre + +ARG HADOOP_VERSION +ARG HIVE_VERSION +ARG TEZ_VERSION + +# Install dependencies +RUN set -ex; \ apt-get update; \ apt-get -y install curl; \ apt-get -y install procps; \ rm -rf /var/lib/apt/lists/* + +COPY hadoop-$HADOOP_VERSION.tar.gz /opt +COPY apache-hive-$HIVE_VERSION-bin.tar.gz /opt +COPY apache-tez-$TEZ_VERSION-bin.tar.gz /opt + +RUN tar -xzvf /opt/hadoop-$HADOOP_VERSION.tar.gz -C /opt/ && \ Review Comment: I don't fully understand, but I will leave it open for further improvements. Issue Time Tracking --- Worklog Id: (was: 851947) Time Spent: 8h 10m (was: 8h) > Provide docker images for Hive > -- > > Key: HIVE-26400 > URL: https://issues.apache.org/jira/browse/HIVE-26400 > Project: Hive > Issue Type: Sub-task > Components: Build Infrastructure >Reporter: Zhihua Deng >Assignee: Zhihua Deng >Priority: Blocker > Labels: hive-4.0.0-must, pull-request-available > Time Spent: 8h 10m > Remaining Estimate: 0h > > Make Apache Hive able to run inside a Docker container in pseudo-distributed > mode, with MySQL/Derby as its backing database, and provide the following: > * Quick-start/Debugging/Prepare a test env for Hive; > * Tools to build a target image with a specified version of Hive and its > dependencies; > * Images that can be used as the basis for the Kubernetes operator. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-27135) Cleaner fails with FileNotFoundException
[ https://issues.apache.org/jira/browse/HIVE-27135?focusedWorklogId=851945&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-851945 ] ASF GitHub Bot logged work on HIVE-27135: - Author: ASF GitHub Bot Created on: 21/Mar/23 07:57 Start Date: 21/Mar/23 07:57 Worklog Time Spent: 10m Work Description: deniskuzZ commented on code in PR #4114: URL: https://github.com/apache/hive/pull/4114#discussion_r1142997113 ## ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java: ## @@ -1538,32 +1538,36 @@ private static HdfsDirSnapshot addToSnapshot(Map<Path, HdfsDirSnapshot> dirToSna public static Map<Path, HdfsDirSnapshot> getHdfsDirSnapshots(final FileSystem fs, final Path path) throws IOException { Map<Path, HdfsDirSnapshot> dirToSnapshots = new HashMap<>(); -RemoteIterator<LocatedFileStatus> itr = FileUtils.listFiles(fs, path, true, acidHiddenFileFilter); -while (itr.hasNext()) { - FileStatus fStatus = itr.next(); - Path fPath = fStatus.getPath(); - if (fStatus.isDirectory() && acidTempDirFilter.accept(fPath)) { -addToSnapshot(dirToSnapshots, fPath); - } else { -Path parentDirPath = fPath.getParent(); -if (acidTempDirFilter.accept(parentDirPath)) { - while (isChildOfDelta(parentDirPath, path)) { -// In some cases there are other directory layers between the delta and the datafiles -// (export-import mm table, insert with union all to mm table, skewed tables). -// But it does not matter for the AcidState, we just need the deltas and the data files -// So build the snapshot with the files inside the delta directory -parentDirPath = parentDirPath.getParent(); - } - HdfsDirSnapshot dirSnapshot = addToSnapshot(dirToSnapshots, parentDirPath); - // We're not filtering out the metadata file and acid format file, - // as they represent parts of a valid snapshot - // We're not using the cached values downstream, but we can potentially optimize more in a follow-up task - if (fStatus.getPath().toString().contains(MetaDataFile.METADATA_FILE)) { -dirSnapshot.addMetadataFile(fStatus); - } else if (fStatus.getPath().toString().contains(OrcAcidVersion.ACID_FORMAT)) { -dirSnapshot.addOrcAcidFormatFile(fStatus); - } else { -dirSnapshot.addFile(fStatus); +Deque<RemoteIterator<LocatedFileStatus>> stack = new ArrayDeque<>(); Review Comment: I don't really like this approach; you basically moved the logic from the recursive iterator into the addToSnapshot method. Issue Time Tracking --- Worklog Id: (was: 851945) Time Spent: 2h (was: 1h 50m) > Cleaner fails with FileNotFoundException > > > Key: HIVE-27135 > URL: https://issues.apache.org/jira/browse/HIVE-27135 > Project: Hive > Issue Type: Bug >Reporter: Dayakar M >Assignee: Dayakar M >Priority: Major > Labels: pull-request-available > Time Spent: 2h > Remaining Estimate: 0h > > The compaction fails when the Cleaner tries to remove a missing directory > from HDFS. > {code:java} > 2023-03-06 07:45:48,331 ERROR > org.apache.hadoop.hive.ql.txn.compactor.Cleaner: > [Cleaner-executor-thread-12]: Caught exception when cleaning, unable to > complete cleaning of > id:39762523,dbname:test,tableName:test_table,partName:null,state:,type:MINOR,enqueueTime:0,start:0,properties:null,runAs:hive,tooManyAborts:false,hasOldAbort:false,highestWriteId:989,errorMessage:null,workerId: > null,initiatorId: null java.io.FileNotFoundException: File > hdfs:/cluster/warehouse/tablespace/managed/hive/test.db/test_table/.hive-staging_hive_2023-03-06_07-45-23_120_4659605113266849995-73550 > does not exist. 
> at > org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1275) > at > org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1249) > at > org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1194) > at > org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1190) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.listLocatedStatus(DistributedFileSystem.java:1208) > at org.apache.hadoop.fs.FileSystem.listLocatedStatus(FileSystem.java:2144) > at org.apache.hadoop.fs.FileSystem$5.handleFileStat(FileSystem.java:2332) > at org.apache.hadoop.fs.FileSystem$5.hasNext(FileSystem.java:2309) > at > org.apache.hadoop.util.functional.RemoteIterators$WrappingRemoteIterator.sourceHasNext(RemoteIterators.java:432) > at > org.apache.hadoop.util.functional.RemoteIterators$FilteringRemoteI
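For context on the review comment above, a minimal sketch (assumed class and method names, stripped of Hive's snapshot bookkeeping and ACID filters) of the stack-of-iterators technique the diff introduces: instead of one recursive listing, each directory's RemoteIterator is pushed onto a Deque so the traversal state is explicit and each level can be handled individually:

```java
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Deque;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.LocatedFileStatus;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;

// Illustrative only; the real PR integrates this walk with addToSnapshot and
// the acidHiddenFileFilter/acidTempDirFilter logic shown in the diff.
public class IterativeWalk {
    public static void walk(FileSystem fs, Path root) throws IOException {
        Deque<RemoteIterator<LocatedFileStatus>> stack = new ArrayDeque<>();
        stack.push(fs.listLocatedStatus(root));
        while (!stack.isEmpty()) {
            RemoteIterator<LocatedFileStatus> itr = stack.peek();
            if (!itr.hasNext()) {
                stack.pop(); // this directory is exhausted, resume its parent
                continue;
            }
            LocatedFileStatus status = itr.next();
            if (status.isDirectory()) {
                // Descending re-lists the child directory; if it was deleted in the
                // meantime (e.g. a .hive-staging dir), listLocatedStatus throws the
                // FileNotFoundException seen in the Cleaner stack trace above.
                stack.push(fs.listLocatedStatus(status.getPath()));
            } else {
                System.out.println(status.getPath()); // stand-in for snapshot bookkeeping
            }
        }
    }
}
```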