[jira] [Assigned] (HIVE-26955) Alter table change column data type of a Parquet table throws exception
[ https://issues.apache.org/jira/browse/HIVE-26955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sourabh Badhya reassigned HIVE-26955:
-------------------------------------

    Assignee: Sourabh Badhya

> Alter table change column data type of a Parquet table throws exception
> -----------------------------------------------------------------------
>
>                 Key: HIVE-26955
>                 URL: https://issues.apache.org/jira/browse/HIVE-26955
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>            Reporter: Taraka Rama Rao Lethavadla
>            Assignee: Sourabh Badhya
>            Priority: Major
>
> Steps to reproduce
> {noformat}
> create table test_parquet (id decimal) stored as parquet;
> insert into test_parquet values(238);
> alter table test_parquet change id id string;
> select * from test_parquet;
> Error: java.io.IOException: org.apache.parquet.io.ParquetDecodingException: Can not read value at 1 in block 0 in file hdfs:/namenode:8020/warehouse/tablespace/managed/hive/test_parquet/delta_001_001_/00_0 (state=,code=0)
>   at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:624)
>   at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:531)
>   at org.apache.hadoop.hive.ql.exec.FetchTask.executeInner(FetchTask.java:194)
>   ... 55 more
> Caused by: org.apache.parquet.io.ParquetDecodingException: Can not read value at 1 in block 0 in file file:/home/centos/Apache-Hive-Tarak/itests/qtest/target/localfs/warehouse/test_parquet/00_0
>   at org.apache.parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:255)
>   at org.apache.parquet.hadoop.ParquetRecordReader.nextKeyValue(ParquetRecordReader.java:207)
>   at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:87)
>   at org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:89)
>   at org.apache.hadoop.hive.ql.exec.FetchOperator$FetchInputFormatSplit.getRecordReader(FetchOperator.java:771)
>   at org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:335)
>   at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:562)
>   ... 57 more
> Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.typeinfo.PrimitiveTypeInfo cannot be cast to org.apache.hadoop.hive.serde2.typeinfo.DecimalTypeInfo
>   at org.apache.hadoop.hive.ql.io.parquet.convert.ETypeConverter$8$5.convert(ETypeConverter.java:669)
>   at org.apache.hadoop.hive.ql.io.parquet.convert.ETypeConverter$8$5.convert(ETypeConverter.java:664)
>   at org.apache.hadoop.hive.ql.io.parquet.convert.ETypeConverter$BinaryConverter.addBinary(ETypeConverter.java:977)
>   at org.apache.parquet.column.impl.ColumnReaderBase$2$6.writeValue(ColumnReaderBase.java:360)
>   at org.apache.parquet.column.impl.ColumnReaderBase.writeCurrentValueToConverter(ColumnReaderBase.java:410)
>   at org.apache.parquet.column.impl.ColumnReaderImpl.writeCurrentValueToConverter(ColumnReaderImpl.java:30)
>   at org.apache.parquet.io.RecordReaderImplementation.read(RecordReaderImplementation.java:406)
>   at org.apache.parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:230)
>   ... 63 more{noformat}
> However the same works as expected with an ORC table:
> {noformat}
> create table test_orc (id decimal) stored as orc;
> insert into test_orc values(238);
> alter table test_orc change id id string;
> select * from test_orc;
> +--------------+
> | test_orc.id  |
> +--------------+
> | 238          |
> +--------------+{noformat}
> as well as with a text table:
> {noformat}
> create table test_text (id decimal) stored as textfile;
> insert into test_text values(238);
> alter table test_text change id id string;
> select * from test_text;
> +---------------+
> | test_text.id  |
> +---------------+
> | 238           |
> +---------------+{noformat}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
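The ClassCastException at the bottom of the trace suggests why Parquet fails while ORC and text succeed: the Parquet binary-to-decimal converter assumes the column's declared TypeInfo is still a decimal. The sketch below is an illustrative Python simulation of that failure mode (hypothetical names; Hive's actual ETypeConverter is Java), not Hive source code.

```python
# Simulated TypeInfo hierarchy, loosely mirroring Hive's serde2.typeinfo classes.
class PrimitiveTypeInfo:
    def __init__(self, type_name):
        self.type_name = type_name

class DecimalTypeInfo(PrimitiveTypeInfo):
    def __init__(self, precision, scale):
        super().__init__("decimal")
        self.precision = precision
        self.scale = scale

def convert_decimal_binary(raw, declared_type):
    """Mimics the unchecked cast: the converter assumes declared_type
    is DecimalTypeInfo so it can read precision/scale."""
    if not isinstance(declared_type, DecimalTypeInfo):
        # Analogous to: PrimitiveTypeInfo cannot be cast to DecimalTypeInfo
        raise TypeError("PrimitiveTypeInfo cannot be cast to DecimalTypeInfo")
    unscaled = int.from_bytes(raw, "big", signed=True)
    return unscaled / (10 ** declared_type.scale)

# The file still carries decimal-encoded bytes; reading with a decimal type works:
print(convert_decimal_binary(b"\x00\xee", DecimalTypeInfo(10, 0)))  # 238.0

# But after ALTER TABLE ... CHANGE id id string, the declared type is plain string:
try:
    convert_decimal_binary(b"\x00\xee", PrimitiveTypeInfo("string"))
except TypeError as err:
    print(err)
```

ORC and text readers convert through the declared type lazily, which is one plausible reason the same ALTER succeeds there.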
[jira] [Work logged] (HIVE-26598) Fix unsetting of db params for optimized bootstrap when repl dump initiates data copy
[ https://issues.apache.org/jira/browse/HIVE-26598?focusedWorklogId=839845&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839845 ]

ASF GitHub Bot logged work on HIVE-26598:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 18/Jan/23 07:37
            Start Date: 18/Jan/23 07:37
    Worklog Time Spent: 10m
      Work Description: sonarcloud[bot] commented on PR #3780:
URL: https://github.com/apache/hive/pull/3780#issuecomment-1386612591

   Kudos, SonarCloud Quality Gate passed!
   0 Bugs, 0 Vulnerabilities, 0 Security Hotspots, 0 Code Smells.
   No Coverage information. No Duplication information.

Issue Time Tracking
-------------------

    Worklog Id:     (was: 839845)
    Time Spent: 40m  (was: 0.5h)

> Fix unsetting of db params for optimized bootstrap when repl dump initiates
> data copy
> ---------------------------------------------------------------------------
>
>                 Key: HIVE-26598
>                 URL: https://issues.apache.org/jira/browse/HIVE-26598
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Teddy Choi
>            Assignee: Rakshith C
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> When hive.repl.run.data.copy.tasks.on.target is set to false, the repl dump task initiates the copy task from the source cluster to the staging directory.
> In the current code flow, the repl dump task dumps the metadata and then creates another repl dump task with datacopyIterators initialized.
> When the second dump cycle executes, it directly begins the data copy tasks. Because of this we never enter the second reverse dump flow, and unsetDbPropertiesForOptimisedBootstrap is never set to true again.
> This results in the db params (repl.target.for, repl.background.threads, etc.) not being unset.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
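The early-return flow described in HIVE-26598 can be sketched as follows. This is a minimal Python simulation of the control-flow bug under assumed names (Hive's ReplDumpTask is Java; none of these identifiers are real Hive APIs): once the first cycle hands over datacopyIterators, the second cycle goes straight to data copy and the branch that unsets the db params is never reached.

```python
def run_dump_cycle(state):
    """One repl dump cycle over a mutable state dict (illustrative only)."""
    if state["data_copy_iterators"]:
        # Second cycle: begins data copy directly and returns early,
        # skipping the reverse-dump branch that unsets the db params.
        state["data_copy_iterators"].clear()
        return "data-copy-only"
    if state.get("unset_db_properties_for_optimised_bootstrap"):
        state["db_params"].clear()
        return "reverse-dump"
    return "metadata-dump"

state = {
    "data_copy_iterators": ["copy-task-1"],   # initialized by the first cycle
    "unset_db_properties_for_optimised_bootstrap": True,
    "db_params": {"repl.target.for", "repl.background.threads"},
}
print(run_dump_cycle(state))   # second cycle takes the data-copy-only path
print(state["db_params"])      # params survive: the reported bug
```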
[jira] [Commented] (HIVE-26959) CREATE TABLE with external.table.purge=false is ignored
[ https://issues.apache.org/jira/browse/HIVE-26959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17678103#comment-17678103 ]

Ayush Saxena commented on HIVE-26959:
-------------------------------------

You should have used CREATE EXTERNAL TABLE rather than just CREATE TABLE. Without the EXTERNAL keyword it is treated as an attempt to create a managed table, which gets translated to an external table with purge=true, and that takes precedence over the value specified in the table properties.

> CREATE TABLE with external.table.purge=false is ignored
> -------------------------------------------------------
>
>                 Key: HIVE-26959
>                 URL: https://issues.apache.org/jira/browse/HIVE-26959
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 4.0.0-alpha-2
>            Reporter: Li Penglin
>            Priority: Major
>
> We set the default external.table.purge=true in https://issues.apache.org/jira/browse/HIVE-26064, but this property is still true when I set it to false.
>
> {code:java}
> select version();
> +----------------------------------------------------------+
> | _c0                                                      |
> +----------------------------------------------------------+
> | 4.0.0-alpha-2 r36f5d91acb0fac00a5d46049bd45b744fe9aaab6  |
> +----------------------------------------------------------+
> 0: jdbc:hive2://localhost:11050> create table test_parq_hive (i int)
> . . . . . . . . . . . . . . . .> stored as parquet
> . . . . . . . . . . . . . . . .> tblproperties ('external.table.purge'='false');
> INFO  : Compiling command(queryId=root_20230118111342_c60c0f5f-3e2f-45af-a7ab-13f099b7ead9): create table test_parq_hive (i int) stored as parquet tblproperties ('external.table.purge'='false')
> INFO  : Semantic Analysis Completed (retrial = false)
> 0: jdbc:hive2://localhost:11050> describe formatted test_parq_hive;
> | | bucketing_version     | 2          |
> | | external.table.purge  | TRUE       |
> | | transient_lastDdlTime | 1674011622 |
> {code}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
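The precedence rule in the comment above can be sketched in a few lines. This is a hedged, simplified Python model of the translation behaviour (the function name and structure are assumptions, not the HMS translator's real API): a plain CREATE TABLE for a non-transactional table is translated to an external table with purge forced on, overriding TBLPROPERTIES, while an explicit CREATE EXTERNAL TABLE keeps the user-supplied value.

```python
def resolve_purge(explicitly_external, tblproperties):
    """Return the effective table properties after HMS translation (simplified)."""
    props = dict(tblproperties)
    if not explicitly_external:
        # Managed, non-ACID table: translated to external, purge forced to true,
        # taking precedence over whatever the user put in TBLPROPERTIES.
        props["EXTERNAL"] = "TRUE"
        props["external.table.purge"] = "true"
    else:
        # Explicit CREATE EXTERNAL TABLE: the user's setting survives.
        props.setdefault("external.table.purge", "true")
    return props

# Reproduces the report: purge=false is ignored without the EXTERNAL keyword.
print(resolve_purge(False, {"external.table.purge": "false"}))
# With CREATE EXTERNAL TABLE, the user's value is honoured.
print(resolve_purge(True, {"external.table.purge": "false"}))
```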
[jira] [Work logged] (HIVE-22977) Merge delta files instead of running a query in major/minor compaction
[ https://issues.apache.org/jira/browse/HIVE-22977?focusedWorklogId=839844&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839844 ]

ASF GitHub Bot logged work on HIVE-22977:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 18/Jan/23 07:27
            Start Date: 18/Jan/23 07:27
    Worklog Time Spent: 10m
      Work Description: sonarcloud[bot] commented on PR #3801:
URL: https://github.com/apache/hive/pull/3801#issuecomment-1386603262

   Kudos, SonarCloud Quality Gate passed!
   0 Bugs, 0 Vulnerabilities, 0 Security Hotspots, 2 Code Smells.
   No Coverage information. No Duplication information.

Issue Time Tracking
-------------------

    Worklog Id:     (was: 839844)
    Time Spent: 4h 50m  (was: 4h 40m)

> Merge delta files instead of running a query in major/minor compaction
> ----------------------------------------------------------------------
>
>                 Key: HIVE-22977
>                 URL: https://issues.apache.org/jira/browse/HIVE-22977
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: László Pintér
>            Assignee: Sourabh Badhya
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-22977.01.patch, HIVE-22977.02.patch
>
>          Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> [Compaction Optimization]
> We should analyse the possibility of moving a delta file instead of running a major/minor compaction query.
> Please consider the following use cases:
> - Full ACID table, but only insert queries were run. This means that no delete delta directories were created. Is it possible to merge the delta directory contents without running a compaction query?
> - Full ACID table, initiating queries through the streaming API. If there are no aborted transactions during the streaming, is it possible to merge the delta directory contents without running a compaction query?

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
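The insert-only case in HIVE-22977 can be sketched as a simple merge. This is an illustrative Python sketch under stated assumptions (the directory naming and row layout are simplified; real Hive deltas hold ORC files with ROW__ID metadata): with no delete deltas and no aborted transactions, "compaction" can in principle be a merge of the delta directories' contents in writeId order, without a query.

```python
def merge_insert_only_deltas(deltas):
    """deltas: {delta_dir_name: [rows]} for an insert-only ACID table.
    Merge contents in writeId order; zero-padded delta_<min>_<max> names
    make lexical sort equal numeric writeId sort."""
    merged = []
    for name in sorted(deltas):
        merged.extend(deltas[name])
    return merged

deltas = {
    "delta_0000002_0000002": [(2, "b")],
    "delta_0000001_0000001": [(1, "a")],
}
print(merge_insert_only_deltas(deltas))  # [(1, 'a'), (2, 'b')]
```

The design question in the ticket is exactly when this shortcut is safe: any delete delta or aborted transaction invalidates the plain merge.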
[jira] [Updated] (HIVE-26962) Expose resume/reset ready state through replication metrics when first cycle of resume/reset completes
[ https://issues.apache.org/jira/browse/HIVE-26962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shreenidhi updated HIVE-26962:
------------------------------
    Description: 
As the resume/reset workflow also follows optimised bootstrap, there are 2 cycles to mark this flow as complete.
1. The 1st cycle is triggered by the orchestrator as soon as the resume/reset action is initiated.
2. To initiate the next cycle, the orchestrator needs to know whether the first cycle has completed. For this we need a mechanism in Hive that puts a RESUME/RESET_READY state in the replication metrics once the first cycle of RESUME/RESET completes.
* Once the orchestrator sees the RESET_READY state, it triggers another cycle and does the work needed to complete the RESET workflow.

  was:
As resume/reset workflow also follows optimised bootstrap, so here we have 2 cycles to mark this flow as complete.
1. 1st cycle will be triggered by orchestrator just when resume/reset action initiated.
2. now to initiate another cycle orchestrator needs to know if the first cycle got complete. To do this we need a mechanism in hive where it puts RESUME/RESET_READY state in replication metrics once the first cycle of RESUME/RESET completes.
* Once orchestrator sees the RESET_READY state, it will trigger the another cycle and does necessary work which needs to be done to complete RESET workflow.

> Expose resume/reset ready state through replication metrics when first cycle
> of resume/reset completes
> ----------------------------------------------------------------------------
>
>                 Key: HIVE-26962
>                 URL: https://issues.apache.org/jira/browse/HIVE-26962
>             Project: Hive
>          Issue Type: Task
>            Reporter: Shreenidhi
>            Assignee: Shreenidhi
>            Priority: Major
>
> As the resume/reset workflow also follows optimised bootstrap, there are 2 cycles to mark this flow as complete.
> 1. The 1st cycle is triggered by the orchestrator as soon as the resume/reset action is initiated.
> 2. To initiate the next cycle, the orchestrator needs to know whether the first cycle has completed. For this we need a mechanism in Hive that puts a RESUME/RESET_READY state in the replication metrics once the first cycle of RESUME/RESET completes.
> * Once the orchestrator sees the RESET_READY state, it triggers another cycle and does the work needed to complete the RESET workflow.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
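The two-cycle handshake requested in HIVE-26962 can be sketched from the orchestrator's side. This is a hypothetical Python sketch (state names follow the ticket; the orchestrator API is an assumption): the orchestrator triggers the first cycle, then waits until the replication metrics report RESET_READY before triggering the second.

```python
def orchestrate(metrics_feed):
    """metrics_feed: iterable of states read from replication metrics.
    Returns the ordered list of orchestrator actions (illustrative)."""
    actions = ["trigger-first-cycle"]
    for state in metrics_feed:
        if state == "RESET_READY":
            # First optimised-bootstrap cycle is done: safe to continue.
            actions.append("trigger-second-cycle")
            break
    return actions

print(orchestrate(["IN_PROGRESS", "RESET_READY"]))
```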
[jira] [Assigned] (HIVE-26962) Expose resume/reset ready state through replication metrics when first cycle of resume/reset completes
[ https://issues.apache.org/jira/browse/HIVE-26962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shreenidhi reassigned HIVE-26962:
---------------------------------

    Assignee: Shreenidhi

> Expose resume/reset ready state through replication metrics when first cycle
> of resume/reset completes
> ----------------------------------------------------------------------------
>
>                 Key: HIVE-26962
>                 URL: https://issues.apache.org/jira/browse/HIVE-26962
>             Project: Hive
>          Issue Type: Task
>            Reporter: Shreenidhi
>            Assignee: Shreenidhi
>            Priority: Major
>
> As resume/reset workflow also follows optimised bootstrap, so here we have 2 cycles to mark this flow as complete.
> 1. 1st cycle will be triggered by orchestrator just when resume/reset action initiated.
> 2. now to initiate another cycle orchestrator needs to know if the first cycle got complete. To do this we need a mechanism in hive where it puts RESUME/RESET_READY state in replication metrics once the first cycle of RESUME/RESET completes.
> * Once orchestrator sees the RESET_READY state, it will trigger the another cycle and does necessary work which needs to be done to complete RESET workflow.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
[jira] [Updated] (HIVE-26959) CREATE TABLE with external.table.purge=false is ignored
[ https://issues.apache.org/jira/browse/HIVE-26959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Li Penglin updated HIVE-26959:
------------------------------
    Description: 
We set the default external.table.purge=true in https://issues.apache.org/jira/browse/HIVE-26064, but this property is still true when I set it to false.
{code:java}
select version();
+----------------------------------------------------------+
| _c0                                                      |
+----------------------------------------------------------+
| 4.0.0-alpha-2 r36f5d91acb0fac00a5d46049bd45b744fe9aaab6  |
+----------------------------------------------------------+
0: jdbc:hive2://localhost:11050> create table test_parq_hive (i int)
. . . . . . . . . . . . . . . .> stored as parquet
. . . . . . . . . . . . . . . .> tblproperties ('external.table.purge'='false');
INFO  : Compiling command(queryId=root_20230118111342_c60c0f5f-3e2f-45af-a7ab-13f099b7ead9): create table test_parq_hive (i int) stored as parquet tblproperties ('external.table.purge'='false')
INFO  : Semantic Analysis Completed (retrial = false)
0: jdbc:hive2://localhost:11050> describe formatted test_parq_hive;
| | bucketing_version     | 2          |
| | external.table.purge  | TRUE       |
| | transient_lastDdlTime | 1674011622 |
{code}

  was:
We set the default external.table.purge=true in https://issues.apache.org/jira/browse/HIVE-26064, but this property is still true when I set it to false.
{code:java}
0: jdbc:hive2://localhost:11050> create table test_parq_hive (i int)
. . . . . . . . . . . . . . . .> stored as parquet
. . . . . . . . . . . . . . . .> tblproperties ('external.table.purge'='false');
INFO  : Compiling command(queryId=root_20230118111342_c60c0f5f-3e2f-45af-a7ab-13f099b7ead9): create table test_parq_hive (i int) stored as parquet tblproperties ('external.table.purge'='false')
INFO  : Semantic Analysis Completed (retrial = false)
0: jdbc:hive2://localhost:11050> describe formatted test_parq_hive;
| | bucketing_version     | 2          |
| | external.table.purge  | TRUE       |
| | transient_lastDdlTime | 1674011622 |
{code}

> CREATE TABLE with external.table.purge=false is ignored
> -------------------------------------------------------
>
>                 Key: HIVE-26959
>                 URL: https://issues.apache.org/jira/browse/HIVE-26959
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 4.0.0-alpha-2
>            Reporter: Li Penglin
>            Priority: Major
>
> We set the default external.table.purge=true in https://issues.apache.org/jira/browse/HIVE-26064, but this property is still true when I set it to false.
>
> {code:java}
> select version();
> +----------------------------------------------------------+
> | _c0                                                      |
> +----------------------------------------------------------+
> | 4.0.0-alpha-2 r36f5d91acb0fac00a5d46049bd45b744fe9aaab6  |
> 0: jdbc:hive2://localhost:11050> create table test_parq_hive (i int)
> . . . . . . . . . . . . . . . .> stored as parquet
> . . . . . . . . . . . . . . . .> tblproperties ('external.table.purge'='false');
> INFO  : Compiling command(queryId=root_20230118111342_c60c0f5f-3e2f-45af-a7ab-13f099b7ead9): create table test_parq_hive (i int) stored as parquet tblproperties ('external.table.purge'='false')
> INFO  : Semantic Analysis Completed (retrial = false)
> 0: jdbc:hive2://localhost:11050>
[jira] [Updated] (HIVE-26959) CREATE TABLE with external.table.purge=false is ignored
[ https://issues.apache.org/jira/browse/HIVE-26959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Li Penglin updated HIVE-26959:
------------------------------
    Affects Version/s: 4.0.0-alpha-2

> CREATE TABLE with external.table.purge=false is ignored
> -------------------------------------------------------
>
>                 Key: HIVE-26959
>                 URL: https://issues.apache.org/jira/browse/HIVE-26959
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 4.0.0-alpha-2
>            Reporter: Li Penglin
>            Priority: Major
>
> We set the default external.table.purge=true in https://issues.apache.org/jira/browse/HIVE-26064, but this property is still true when I set it to false.
>
> {code:java}
> 0: jdbc:hive2://localhost:11050> create table test_parq_hive (i int)
> . . . . . . . . . . . . . . . .> stored as parquet
> . . . . . . . . . . . . . . . .> tblproperties ('external.table.purge'='false');
> INFO  : Compiling command(queryId=root_20230118111342_c60c0f5f-3e2f-45af-a7ab-13f099b7ead9): create table test_parq_hive (i int) stored as parquet tblproperties ('external.table.purge'='false')
> INFO  : Semantic Analysis Completed (retrial = false)
> 0: jdbc:hive2://localhost:11050> describe formatted test_parq_hive;
> | | bucketing_version     | 2          |
> | | external.table.purge  | TRUE       |
> | | transient_lastDdlTime | 1674011622 |
> {code}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
[jira] [Assigned] (HIVE-26961) Fix improper replication metric count when hive.repl.filter.transactions is set to true.
[ https://issues.apache.org/jira/browse/HIVE-26961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rakshith C reassigned HIVE-26961:
---------------------------------

> Fix improper replication metric count when hive.repl.filter.transactions is
> set to true.
> ---------------------------------------------------------------------------
>
>                 Key: HIVE-26961
>                 URL: https://issues.apache.org/jira/browse/HIVE-26961
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Rakshith C
>            Assignee: Rakshith C
>            Priority: Major
>
> Scenario:
> When hive.repl.filter.transactions = true, repl dump filters out read-only transactions to improve throughput.
> The metrics logged to HMS are improper because there is a mismatch between the count of events read from the notification logs and the count of events dumped to the staging directory.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
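The mismatch in HIVE-26961 is easy to see in miniature. The sketch below is a simplified Python model (the function and event shape are assumptions, not Hive's NotificationLog API): the metric counts every event read from the notification log, while the dump skips read-only transactions, so the two counts diverge whenever the filter is on.

```python
def dump_events(events, filter_read_only=True):
    """Return (events_read, events_dumped) for one dump cycle (illustrative)."""
    events_read = len(events)  # what the metric currently reports
    dumped = [e for e in events
              if not (filter_read_only and e["read_only"])]
    return events_read, len(dumped)

events = [{"id": 1, "read_only": False},
          {"id": 2, "read_only": True},   # filtered out of the dump
          {"id": 3, "read_only": False}]
read, dumped = dump_events(events)
print(read, dumped)  # 3 2 — the counts disagree when filtering is enabled
```

The fix direction implied by the ticket is to base the metric on events actually dumped (or to count filtered events separately) so the two figures reconcile.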
[jira] [Updated] (HIVE-26943) Fix NPE during Optimised Bootstrap when db is dropped
[ https://issues.apache.org/jira/browse/HIVE-26943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ayush Saxena updated HIVE-26943:
--------------------------------
        Parent: HIVE-25699
    Issue Type: Sub-task  (was: Task)

> Fix NPE during Optimised Bootstrap when db is dropped
> -----------------------------------------------------
>
>                 Key: HIVE-26943
>                 URL: https://issues.apache.org/jira/browse/HIVE-26943
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Shreenidhi
>            Assignee: Shreenidhi
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> Consider the steps:
> 1. Current replication is from A (source) -> B (target).
> 2. Failover is complete, so now A (target) <- B (source).
> 3. Suppose the db at A is dropped before reverse replication.
> 4. Now when reverse replication triggers optimised bootstrap it will throw an NPE.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
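The shape of the fix for HIVE-26943 is a null guard on the target-database lookup. Below is a minimal Python stand-in (the real change is in Hive's Java ReplLoadTask; these names and the error message are illustrative assumptions): when the db was dropped before reverse replication, the lookup returns nothing, and the old code dereferenced that result directly.

```python
def execute_incremental_load(get_database, db_name):
    """Sketch of the guarded path: look up the target db before using it."""
    target_db = get_database(db_name)
    if target_db is None:
        # Guard in the spirit of the fix: fail with a clear error
        # instead of a NullPointerException further down.
        raise ValueError(f"Database {db_name} does not exist on target")
    return target_db["params"]

# Db dropped at the target before reverse replication kicked in:
try:
    execute_incremental_load(lambda name: None, "salesdb")
except ValueError as err:
    print(err)
```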
[jira] [Assigned] (HIVE-26960) Optimized bootstrap does not drop newly added tables at source.
[ https://issues.apache.org/jira/browse/HIVE-26960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rakshith C reassigned HIVE-26960:
---------------------------------

> Optimized bootstrap does not drop newly added tables at source.
> ---------------------------------------------------------------
>
>                 Key: HIVE-26960
>                 URL: https://issues.apache.org/jira/browse/HIVE-26960
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Rakshith C
>            Assignee: Rakshith C
>            Priority: Major
>
> Scenario:
> Replication is set up from DR to PROD after failover from PROD to DR; no existing tables are modified at PROD, but a new table is added at PROD.
> Observations:
> * The _bootstrap directory won't be created during the second cycle of optimized bootstrap because no existing tables were modified.
> * Because of this, the list of tables to drop at PROD is never initialized.
> * This leads to the new table created at PROD not being dropped.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
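The observations in HIVE-26960 reduce to a set difference that is only computed on one code path. The sketch below is an illustrative Python model under assumed names (not Hive source): the drop list should be the tables present at the source minus the tables in the target's dump, but it is only built when the _bootstrap directory was created, i.e. when some existing table was modified.

```python
def tables_to_drop(source_tables, target_tables, bootstrap_dir_created):
    """Tables to drop at the (new) source during optimized bootstrap."""
    if not bootstrap_dir_created:
        # Bug path: no modified tables -> no _bootstrap dir -> the drop
        # list is never initialized, so a newly added table survives.
        return set()
    return set(source_tables) - set(target_tables)

# No existing table modified at PROD, but t_new was added there:
print(tables_to_drop({"t1", "t_new"}, {"t1"}, bootstrap_dir_created=False))  # set()
print(tables_to_drop({"t1", "t_new"}, {"t1"}, bootstrap_dir_created=True))   # {'t_new'}
```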
[jira] [Work logged] (HIVE-26943) Fix NPE during Optimised Bootstrap when db is dropped
[ https://issues.apache.org/jira/browse/HIVE-26943?focusedWorklogId=839838&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839838 ]

ASF GitHub Bot logged work on HIVE-26943:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 18/Jan/23 06:53
            Start Date: 18/Jan/23 06:53
    Worklog Time Spent: 10m
      Work Description: shreenidhiSaigaonkar commented on code in PR #3953:
URL: https://github.com/apache/hive/pull/3953#discussion_r1073152555

   ## ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplLoadTask.java: ##
   @@ -721,6 +721,10 @@ private int executeIncrementalLoad(long loadStartTime) throws Exception {
        Database targetDb = getHive().getDatabase(work.dbNameToLoadIn);
        Map props = new HashMap<>();
   +    if(targetDb == null) {

   Review Comment: Done.

Issue Time Tracking
-------------------

    Worklog Id:     (was: 839838)
    Time Spent: 1h  (was: 50m)

> Fix NPE during Optimised Bootstrap when db is dropped
> -----------------------------------------------------
>
>                 Key: HIVE-26943
>                 URL: https://issues.apache.org/jira/browse/HIVE-26943
>             Project: Hive
>          Issue Type: Task
>            Reporter: Shreenidhi
>            Assignee: Shreenidhi
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> Consider the steps:
> 1. Current replication is from A (source) -> B (target).
> 2. Failover is complete, so now A (target) <- B (source).
> 3. Suppose the db at A is dropped before reverse replication.
> 4. Now when reverse replication triggers optimised bootstrap it will throw an NPE.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
[jira] [Work logged] (HIVE-26943) Fix NPE during Optimised Bootstrap when db is dropped
[ https://issues.apache.org/jira/browse/HIVE-26943?focusedWorklogId=839837&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839837 ]

ASF GitHub Bot logged work on HIVE-26943:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 18/Jan/23 06:47
            Start Date: 18/Jan/23 06:47
    Worklog Time Spent: 10m
      Work Description: pudidic commented on code in PR #3953:
URL: https://github.com/apache/hive/pull/3953#discussion_r1073149035

   ## ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplLoadTask.java: ##
   @@ -721,6 +721,10 @@ private int executeIncrementalLoad(long loadStartTime) throws Exception {
        Database targetDb = getHive().getDatabase(work.dbNameToLoadIn);
        Map props = new HashMap<>();
   +    if(targetDb == null) {

   Review Comment: Please follow the coding convention; have a whitespace after if.

Issue Time Tracking
-------------------

    Worklog Id:     (was: 839837)
    Time Spent: 50m  (was: 40m)

> Fix NPE during Optimised Bootstrap when db is dropped
> -----------------------------------------------------
>
>                 Key: HIVE-26943
>                 URL: https://issues.apache.org/jira/browse/HIVE-26943
>             Project: Hive
>          Issue Type: Task
>            Reporter: Shreenidhi
>            Assignee: Shreenidhi
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> Consider the steps:
> 1. Current replication is from A (source) -> B (target).
> 2. Failover is complete, so now A (target) <- B (source).
> 3. Suppose the db at A is dropped before reverse replication.
> 4. Now when reverse replication triggers optimised bootstrap it will throw an NPE.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
[jira] [Work logged] (HIVE-26952) set the value of metastore.storage.schema.reader.impl to org.apache.hadoop.hive.metastore.SerDeStorageSchemaReader as default
[ https://issues.apache.org/jira/browse/HIVE-26952?focusedWorklogId=839836&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839836 ]

ASF GitHub Bot logged work on HIVE-26952:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 18/Jan/23 06:46
            Start Date: 18/Jan/23 06:46
    Worklog Time Spent: 10m
      Work Description: SourabhBadhya commented on code in PR #3959:
URL: https://github.com/apache/hive/pull/3959#discussion_r1073148504

   ## standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java: ##
   @@ -67,6 +67,9 @@ public class MetastoreConf {
      static final String DEFAULT_STORAGE_SCHEMA_READER_CLASS =
          "org.apache.hadoop.hive.metastore.DefaultStorageSchemaReader";
      @VisibleForTesting
   +  static final String SERDE_STORAGE_SCHEMA_READER_CLASS =
   +      "org.apache.hadoop.hive.metastore.SerDeStorageSchemaReader";

   Review Comment: Since you are introducing this, could you please add an assertion case here? Check the below:
   https://github.com/apache/hive/blob/c92a478e514a28a53009fe5fbf08ce6fa35b58b9/standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/conf/TestMetastoreConf.java#L482

Issue Time Tracking
-------------------

    Worklog Id:     (was: 839836)
    Time Spent: 0.5h  (was: 20m)

> set the value of metastore.storage.schema.reader.impl to
> org.apache.hadoop.hive.metastore.SerDeStorageSchemaReader as default
> -----------------------------------------------------------------------------
>
>                 Key: HIVE-26952
>                 URL: https://issues.apache.org/jira/browse/HIVE-26952
>             Project: Hive
>          Issue Type: Improvement
>          Components: Standalone Metastore
>            Reporter: Taraka Rama Rao Lethavadla
>            Assignee: Taraka Rama Rao Lethavadla
>            Priority: Minor
>              Labels: pull-request-available
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> With the default value of
> {code:java}
> DefaultStorageSchemaReader.class.getName(){code}
> in the Metastore config, *metastore.storage.schema.reader.impl*, the below exception is thrown when trying to read an Avro schema:
> {noformat}
> Caused by: org.apache.hive.service.cli.HiveSQLException: MetaException (message:java.lang.UnsupportedOperationException: Storage schema reading not supported)
>   at org.apache.hive.service.cli.operation.GetColumnsOperation.runInternal(GetColumnsOperation.java:213)
>   at org.apache.hive.service.cli.operation.Operation.run(Operation.java:247)
>   at org.apache.hive.service.cli.session.HiveSessionImpl.getColumns(HiveSessionImpl.java:729)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
>   at org.apache.hive.service.cli.session.HiveSessionProxy.access-zsh(HiveSessionProxy.java:36)
>   at org.apache.hive.service.cli.session.HiveSessionProxy.run(HiveSessionProxy.java:63)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
>   at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
>   at com.sun.proxy..getColumns(Unknown Source)
>   at org.apache.hive.service.cli.CLIService.getColumns(CLIService.java:390){noformat}
> Setting the above config to *org.apache.hadoop.hive.metastore.SerDeStorageSchemaReader* resolves the issue.
> Proposing to make this value the default in the code base, so that in upcoming versions we don't have to set it manually.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
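Why the default matters can be shown with a tiny model. This is a hedged Python sketch (class behaviour heavily simplified; Python's NotImplementedError stands in for Java's UnsupportedOperationException): the default storage schema reader rejects formats it cannot handle, such as Avro, while the SerDe-based reader delegates to the table's SerDe and succeeds.

```python
class DefaultStorageSchemaReader:
    def read_schema(self, table):
        # Mirrors the reported failure: "Storage schema reading not supported".
        raise NotImplementedError("Storage schema reading not supported")

class SerDeStorageSchemaReader:
    def read_schema(self, table):
        # Delegates to the table's SerDe; here we just echo declared columns.
        return table["columns"]

table = {"serde": "avro", "columns": [("i", "int")]}
reader = SerDeStorageSchemaReader()  # the proposed default
print(reader.read_schema(table))     # [('i', 'int')]
```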
[jira] [Work logged] (HIVE-26956) Improve find_in_set function
[ https://issues.apache.org/jira/browse/HIVE-26956?focusedWorklogId=839830=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839830 ] ASF GitHub Bot logged work on HIVE-26956: - Author: ASF GitHub Bot Created on: 18/Jan/23 06:03 Start Date: 18/Jan/23 06:03 Worklog Time Spent: 10m Work Description: sonarcloud[bot] commented on PR #3961: URL: https://github.com/apache/hive/pull/3961#issuecomment-1386535042 Kudos, SonarCloud Quality Gate passed! [![Quality Gate passed](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/QualityGateBadge/passed-16px.png 'Quality Gate passed')](https://sonarcloud.io/dashboard?id=apache_hive=3961) [![Bug](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/bug-16px.png 'Bug')](https://sonarcloud.io/project/issues?id=apache_hive=3961=false=BUG) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/issues?id=apache_hive=3961=false=BUG) [0 Bugs](https://sonarcloud.io/project/issues?id=apache_hive=3961=false=BUG) [![Vulnerability](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/vulnerability-16px.png 'Vulnerability')](https://sonarcloud.io/project/issues?id=apache_hive=3961=false=VULNERABILITY) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/issues?id=apache_hive=3961=false=VULNERABILITY) [0 Vulnerabilities](https://sonarcloud.io/project/issues?id=apache_hive=3961=false=VULNERABILITY) [![Security Hotspot](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/security_hotspot-16px.png 'Security Hotspot')](https://sonarcloud.io/project/security_hotspots?id=apache_hive=3961=false=SECURITY_HOTSPOT) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 
'A')](https://sonarcloud.io/project/security_hotspots?id=apache_hive=3961=false=SECURITY_HOTSPOT) [0 Security Hotspots](https://sonarcloud.io/project/security_hotspots?id=apache_hive=3961=false=SECURITY_HOTSPOT) [![Code Smell](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/code_smell-16px.png 'Code Smell')](https://sonarcloud.io/project/issues?id=apache_hive=3961=false=CODE_SMELL) [![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png 'A')](https://sonarcloud.io/project/issues?id=apache_hive=3961=false=CODE_SMELL) [0 Code Smells](https://sonarcloud.io/project/issues?id=apache_hive=3961=false=CODE_SMELL) [![No Coverage information](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/CoverageChart/NoCoverageInfo-16px.png 'No Coverage information')](https://sonarcloud.io/component_measures?id=apache_hive=3961=coverage=list) No Coverage information [![No Duplication information](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/Duplications/NoDuplicationInfo-16px.png 'No Duplication information')](https://sonarcloud.io/component_measures?id=apache_hive=3961=duplicated_lines_density=list) No Duplication information Issue Time Tracking --- Worklog Id: (was: 839830) Time Spent: 50m (was: 40m) > Improv find_in_set function > --- > > Key: HIVE-26956 > URL: https://issues.apache.org/jira/browse/HIVE-26956 > Project: Hive > Issue Type: Improvement >Reporter: Bingye Chen >Assignee: Bingye Chen >Priority: Minor > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > Improv find_in_set function -- This message was sent by Atlassian Jira (v8.20.10#820010)
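For reference, the semantics being improved are small: FIND_IN_SET(str, strList) returns the 1-based position of str in a comma-separated list, 0 when absent, and NULL when str itself contains a comma. A dependency-free sketch of those semantics follows; it is not the Hive UDF source, and -1 stands in for SQL NULL:

```java
// Minimal sketch of FIND_IN_SET semantics (illustrative, not Hive's code).
public class FindInSet {
    static int findInSet(String str, String strList) {
        if (str.indexOf(',') >= 0) {
            return -1; // Hive returns NULL when the needle contains a comma
        }
        String[] parts = strList.split(",", -1); // keep empty trailing fields
        for (int i = 0; i < parts.length; i++) {
            if (parts[i].equals(str)) {
                return i + 1; // positions are 1-based
            }
        }
        return 0; // not found
    }

    public static void main(String[] args) {
        System.out.println(findInSet("ab", "abc,b,ab,c,def")); // prints 3
    }
}
```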
[jira] [Work started] (HIVE-26601) Fix NPE encountered in second load cycle of optimised bootstrap
[ https://issues.apache.org/jira/browse/HIVE-26601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-26601 started by Vinit Patni. -- > Fix NPE encountered in second load cycle of optimised bootstrap > > > Key: HIVE-26601 > URL: https://issues.apache.org/jira/browse/HIVE-26601 > Project: Hive > Issue Type: Bug >Reporter: Teddy Choi >Assignee: Vinit Patni >Priority: Blocker > > After creating reverse replication policy after failover is completed from > Primary to DR cluster and DR takes over. First dump and load cycle of > optimised bootstrap is completing successfully, Second dump cycle on DR is > also completed which does selective bootstrap of tables that it read from > table_diff directory. However we observed issue with Second load cycle on > Primary Cluster side which is failing with following exception logs that > needs to be fixed. > {code:java} > [Scheduled Query Executor(schedule:repl_vinreverse, execution_id:421)]: > Exception while logging metrics > java.lang.NullPointerException: null > at > org.apache.hadoop.hive.ql.parse.repl.metric.ReplicationMetricCollector.reportStageProgress(ReplicationMetricCollector.java:192) > ~[hive-exec-3.1.3000.7.1.8.0-801.jar:3.1.3000.7.1.8.0-801] > at > org.apache.hadoop.hive.ql.exec.repl.ReplStateLogWork.replStateLog(ReplStateLogWork.java:145) > ~[hive-exec-3.1.3000.7.1.8.0-801.jar:3.1.3000.7.1.8.0-801] > at > org.apache.hadoop.hive.ql.exec.repl.ReplStateLogTask.execute(ReplStateLogTask.java:39) > [hive-exec-3.1.3000.7.1.8.0-801.jar:3.1.3000.7.1.8.0-801] > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213) > [hive-exec-3.1.3000.7.1.8.0-801.jar:3.1.3000.7.1.8.0-801] > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105) > [hive-exec-3.1.3000.7.1.8.0-801.jar:3.1.3000.7.1.8.0-801] > at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:357) > [hive-exec-3.1.3000.7.1.8.0-801.jar:3.1.3000.7.1.8.0-801] > at 
org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:330) > [hive-exec-3.1.3000.7.1.8.0-801.jar:3.1.3000.7.1.8.0-801] > at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:246) > [hive-exec-3.1.3000.7.1.8.0-801.jar:3.1.3000.7.1.8.0-801] > at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:109) > [hive-exec-3.1.3000.7.1.8.0-801.jar:3.1.3000.7.1.8.0-801] > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:749) > [hive-exec-3.1.3000.7.1.8.0-801.jar:3.1.3000.7.1.8.0-801] > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:504) > [hive-exec-3.1.3000.7.1.8.0-801.jar:3.1.3000.7.1.8.0-801] > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:498) > [hive-exec-3.1.3000.7.1.8.0-801.jar:3.1.3000.7.1.8.0-801] > at > org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:166) > [hive-exec-3.1.3000.7.1.8.0-801.jar:3.1.3000.7.1.8.0-801] > at > org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:232) > [hive-exec-3.1.3000.7.1.8.0-801.jar:3.1.3000.7.1.8.0-801] > at > org.apache.hadoop.hive.ql.scheduled.ScheduledQueryExecutionService$ScheduledQueryExecutor.processQuery(ScheduledQueryExecutionService.java:240) > [hive-exec-3.1.3000.7.1.8.0-801.jar:3.1.3000.7.1.8.0-801] > at > org.apache.hadoop.hive.ql.scheduled.ScheduledQueryExecutionService$ScheduledQueryExecutor.run(ScheduledQueryExecutionService.java:193) > [hive-exec-3.1.3000.7.1.8.0-801.jar:3.1.3000.7.1.8.0-801] > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > [?:1.8.0_232] > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > [?:1.8.0_232] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > [?:1.8.0_232] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > [?:1.8.0_232] > at java.lang.Thread.run(Thread.java:748) [?:1.8.0_232]{code} > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HIVE-26599) Fix NPE encountered in second dump cycle of optimised bootstrap
[ https://issues.apache.org/jira/browse/HIVE-26599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinit Patni updated HIVE-26599: --- Status: Patch Available (was: In Progress) > Fix NPE encountered in second dump cycle of optimised bootstrap > --- > > Key: HIVE-26599 > URL: https://issues.apache.org/jira/browse/HIVE-26599 > Project: Hive > Issue Type: Bug >Reporter: Teddy Choi >Assignee: Vinit Patni >Priority: Blocker > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > After creating reverse replication policy after failover is completed from > Primary to DR cluster and DR takes over. First dump and load cycle of > optimised bootstrap is completing successfully, But We are encountering Null > pointer exception in the second dump cycle which is halting this reverse > replication and major blocker to test complete cycle of replication. > {code:java} > Scheduled Query Executor(schedule:repl_reverse, execution_id:14)]: FAILED: > Execution Error, return code -101 from > org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask. 
> java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.parse.repl.metric.ReplicationMetricCollector.reportStageProgress(ReplicationMetricCollector.java:192) > at > org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.dumpTable(ReplDumpTask.java:1458) > at > org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.incrementalDump(ReplDumpTask.java:961) > at > org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.execute(ReplDumpTask.java:290) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105) > at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:357) > at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:330) > at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:246) > at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:109) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:749) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:504) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:498) > at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:166) > at > org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:232){code} > After doing RCA, we figured out that In second dump cycle on DR cluster when > StageStart method is invoked by code, metrics corresponding to Tables is not > being registered (which should be registered as we are doing selective > bootstrap of tables for optimise bootstrap along with incremental dump) which > is causing NPE when it is trying to update the progress corresponding to this > metric latter on after bootstrap of table is completed. > Fix is to register the Tables metric before updating the progress. -- This message was sent by Atlassian Jira (v8.20.10#820010)
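The failure mode and fix described above reduce to a small illustration. The names here (MetricSketch, reportProgress) are hypothetical stand-ins for ReplicationMetricCollector's internals, which keep per-metric progress in a map:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the metric collector: a progress update unboxes
// a map value, so updating a metric that was never registered throws the
// NullPointerException seen in the reportStageProgress frame above.
public class MetricSketch {
    private final Map<String, Long> progress = new HashMap<>();

    void registerMetric(String name) {
        progress.putIfAbsent(name, 0L);
    }

    void reportProgress(String name, long delta) {
        progress.put(name, progress.get(name) + delta); // NPE if unregistered
    }

    long get(String name) {
        return progress.get(name);
    }

    public static void main(String[] args) {
        MetricSketch m = new MetricSketch();
        m.registerMetric("Tables"); // the fix: register before any update
        m.reportProgress("Tables", 1);
        System.out.println(m.get("Tables")); // prints 1
    }
}
```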
[jira] [Work started] (HIVE-26599) Fix NPE encountered in second dump cycle of optimised bootstrap
[ https://issues.apache.org/jira/browse/HIVE-26599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-26599 started by Vinit Patni. -- > Fix NPE encountered in second dump cycle of optimised bootstrap > --- > > Key: HIVE-26599 > URL: https://issues.apache.org/jira/browse/HIVE-26599 > Project: Hive > Issue Type: Bug >Reporter: Teddy Choi >Assignee: Vinit Patni >Priority: Blocker > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > After creating reverse replication policy after failover is completed from > Primary to DR cluster and DR takes over. First dump and load cycle of > optimised bootstrap is completing successfully, But We are encountering Null > pointer exception in the second dump cycle which is halting this reverse > replication and major blocker to test complete cycle of replication. > {code:java} > Scheduled Query Executor(schedule:repl_reverse, execution_id:14)]: FAILED: > Execution Error, return code -101 from > org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask. 
> java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.parse.repl.metric.ReplicationMetricCollector.reportStageProgress(ReplicationMetricCollector.java:192) > at > org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.dumpTable(ReplDumpTask.java:1458) > at > org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.incrementalDump(ReplDumpTask.java:961) > at > org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.execute(ReplDumpTask.java:290) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105) > at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:357) > at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:330) > at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:246) > at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:109) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:749) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:504) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:498) > at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:166) > at > org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:232){code} > After doing RCA, we figured out that In second dump cycle on DR cluster when > StageStart method is invoked by code, metrics corresponding to Tables is not > being registered (which should be registered as we are doing selective > bootstrap of tables for optimise bootstrap along with incremental dump) which > is causing NPE when it is trying to update the progress corresponding to this > metric latter on after bootstrap of table is completed. > Fix is to register the Tables metric before updating the progress. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-26599) Fix NPE encountered in second dump cycle of optimised bootstrap
[ https://issues.apache.org/jira/browse/HIVE-26599?focusedWorklogId=839826=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839826 ] ASF GitHub Bot logged work on HIVE-26599: - Author: ASF GitHub Bot Created on: 18/Jan/23 05:32 Start Date: 18/Jan/23 05:32 Worklog Time Spent: 10m Work Description: vinitpatni opened a new pull request, #3963: URL: https://github.com/apache/hive/pull/3963 …d bootstrap ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? Issue Time Tracking --- Worklog Id: (was: 839826) Remaining Estimate: 0h Time Spent: 10m > Fix NPE encountered in second dump cycle of optimised bootstrap > --- > > Key: HIVE-26599 > URL: https://issues.apache.org/jira/browse/HIVE-26599 > Project: Hive > Issue Type: Bug >Reporter: Teddy Choi >Assignee: Vinit Patni >Priority: Blocker > Time Spent: 10m > Remaining Estimate: 0h > > After creating reverse replication policy after failover is completed from > Primary to DR cluster and DR takes over. First dump and load cycle of > optimised bootstrap is completing successfully, But We are encountering Null > pointer exception in the second dump cycle which is halting this reverse > replication and major blocker to test complete cycle of replication. > {code:java} > Scheduled Query Executor(schedule:repl_reverse, execution_id:14)]: FAILED: > Execution Error, return code -101 from > org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask. 
> java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.parse.repl.metric.ReplicationMetricCollector.reportStageProgress(ReplicationMetricCollector.java:192) > at > org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.dumpTable(ReplDumpTask.java:1458) > at > org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.incrementalDump(ReplDumpTask.java:961) > at > org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.execute(ReplDumpTask.java:290) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105) > at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:357) > at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:330) > at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:246) > at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:109) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:749) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:504) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:498) > at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:166) > at > org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:232){code} > After doing RCA, we figured out that In second dump cycle on DR cluster when > StageStart method is invoked by code, metrics corresponding to Tables is not > being registered (which should be registered as we are doing selective > bootstrap of tables for optimise bootstrap along with incremental dump) which > is causing NPE when it is trying to update the progress corresponding to this > metric latter on after bootstrap of table is completed. > Fix is to register the Tables metric before updating the progress. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HIVE-26599) Fix NPE encountered in second dump cycle of optimised bootstrap
[ https://issues.apache.org/jira/browse/HIVE-26599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-26599: -- Labels: pull-request-available (was: ) > Fix NPE encountered in second dump cycle of optimised bootstrap > --- > > Key: HIVE-26599 > URL: https://issues.apache.org/jira/browse/HIVE-26599 > Project: Hive > Issue Type: Bug >Reporter: Teddy Choi >Assignee: Vinit Patni >Priority: Blocker > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > After creating reverse replication policy after failover is completed from > Primary to DR cluster and DR takes over. First dump and load cycle of > optimised bootstrap is completing successfully, But We are encountering Null > pointer exception in the second dump cycle which is halting this reverse > replication and major blocker to test complete cycle of replication. > {code:java} > Scheduled Query Executor(schedule:repl_reverse, execution_id:14)]: FAILED: > Execution Error, return code -101 from > org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask. 
> java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.parse.repl.metric.ReplicationMetricCollector.reportStageProgress(ReplicationMetricCollector.java:192) > at > org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.dumpTable(ReplDumpTask.java:1458) > at > org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.incrementalDump(ReplDumpTask.java:961) > at > org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.execute(ReplDumpTask.java:290) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105) > at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:357) > at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:330) > at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:246) > at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:109) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:749) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:504) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:498) > at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:166) > at > org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:232){code} > After doing RCA, we figured out that In second dump cycle on DR cluster when > StageStart method is invoked by code, metrics corresponding to Tables is not > being registered (which should be registered as we are doing selective > bootstrap of tables for optimise bootstrap along with incremental dump) which > is causing NPE when it is trying to update the progress corresponding to this > metric latter on after bootstrap of table is completed. > Fix is to register the Tables metric before updating the progress. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (HIVE-26597) Fix unsetting of db prop repl.target.for in ReplicationSemanticAnalyzer
[ https://issues.apache.org/jira/browse/HIVE-26597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rakshith C resolved HIVE-26597. --- Resolution: Fixed > Fix unsetting of db prop repl.target.for in ReplicationSemanticAnalyzer > --- > > Key: HIVE-26597 > URL: https://issues.apache.org/jira/browse/HIVE-26597 > Project: Hive > Issue Type: Bug >Reporter: Teddy Choi >Assignee: Rakshith C >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > when repl policy is set from A -> B > * *repl.target.for* is set on B. > when failover is initiated > * *repl.failover.endpoint* = *'TARGET'* is set on B. > > now when reverse policy is set up from {*}A <- B{*}; > there is a check in > [ReplicationSemanticAnalyzer#initReplDump|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/parse/ReplicationSemanticAnalyzer.java#L196] > which checks for existence of these two properties and if they are set, > it unsets the *repl.target.for* property. > Because of this optimisedBootstrap won't be triggered because it checks for > the existence of *repl.target.for* property during repl dump on target > [HERE|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/repl/OptimisedBootstrapUtils.java#L93]. > > Fix : remove the code which unsets repl.target.for in > ReplicationSemanticAnalyzer, because second dump cycle of optimized bootstrap > unsets it. -- This message was sent by Atlassian Jira (v8.20.10#820010)
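The interaction described above can be sketched without Hive's classes: optimised bootstrap is gated on the presence of the repl.target.for database property, so unsetting it in ReplicationSemanticAnalyzer starved the later check. Class and method names below are illustrative, not Hive source:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of the gating logic: the dump-side check only fires
// while repl.target.for is still present on the database.
public class ReplPropsSketch {
    static boolean shouldRunOptimisedBootstrap(Map<String, String> dbProps) {
        return dbProps.containsKey("repl.target.for");
    }

    public static void main(String[] args) {
        Map<String, String> dbProps = new HashMap<>();
        dbProps.put("repl.target.for", "policyA");
        dbProps.put("repl.failover.endpoint", "TARGET");

        // Old behaviour: the analyzer unset the property here...
        // dbProps.remove("repl.target.for");
        // ...so the check below returned false and optimised bootstrap
        // never triggered. With the unset removed, it returns true.
        System.out.println(shouldRunOptimisedBootstrap(dbProps)); // prints true
    }
}
```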
[jira] [Work logged] (HIVE-26597) Fix unsetting of db prop repl.target.for in ReplicationSemanticAnalyzer
[ https://issues.apache.org/jira/browse/HIVE-26597?focusedWorklogId=839824=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839824 ] ASF GitHub Bot logged work on HIVE-26597: - Author: ASF GitHub Bot Created on: 18/Jan/23 04:35 Start Date: 18/Jan/23 04:35 Worklog Time Spent: 10m Work Description: pudidic merged PR #3788: URL: https://github.com/apache/hive/pull/3788 Issue Time Tracking --- Worklog Id: (was: 839824) Time Spent: 50m (was: 40m) > Fix unsetting of db prop repl.target.for in ReplicationSemanticAnalyzer > --- > > Key: HIVE-26597 > URL: https://issues.apache.org/jira/browse/HIVE-26597 > Project: Hive > Issue Type: Bug >Reporter: Teddy Choi >Assignee: Rakshith C >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > when repl policy is set from A -> B > * *repl.target.for* is set on B. > when failover is initiated > * *repl.failover.endpoint* = *'TARGET'* is set on B. > > now when reverse policy is set up from {*}A <- B{*}; > there is a check in > [ReplicationSemanticAnalyzer#initReplDump|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/parse/ReplicationSemanticAnalyzer.java#L196] > which checks for existence of these two properties and if they are set, > it unsets the *repl.target.for* property. > Because of this optimisedBootstrap won't be triggered because it checks for > the existence of *repl.target.for* property during repl dump on target > [HERE|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/repl/OptimisedBootstrapUtils.java#L93]. > > Fix : remove the code which unsets repl.target.for in > ReplicationSemanticAnalyzer, because second dump cycle of optimized bootstrap > unsets it. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-26597) Fix unsetting of db prop repl.target.for in ReplicationSemanticAnalyzer
[ https://issues.apache.org/jira/browse/HIVE-26597?focusedWorklogId=839823=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839823 ] ASF GitHub Bot logged work on HIVE-26597: - Author: ASF GitHub Bot Created on: 18/Jan/23 04:34 Start Date: 18/Jan/23 04:34 Worklog Time Spent: 10m Work Description: pudidic commented on PR #3788: URL: https://github.com/apache/hive/pull/3788#issuecomment-1386474212 LGTM +1. I'll merge it as it's a trivial change. Thank you. :) Issue Time Tracking --- Worklog Id: (was: 839823) Time Spent: 40m (was: 0.5h) > Fix unsetting of db prop repl.target.for in ReplicationSemanticAnalyzer > --- > > Key: HIVE-26597 > URL: https://issues.apache.org/jira/browse/HIVE-26597 > Project: Hive > Issue Type: Bug >Reporter: Teddy Choi >Assignee: Rakshith C >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > when repl policy is set from A -> B > * *repl.target.for* is set on B. > when failover is initiated > * *repl.failover.endpoint* = *'TARGET'* is set on B. > > now when reverse policy is set up from {*}A <- B{*}; > there is a check in > [ReplicationSemanticAnalyzer#initReplDump|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/parse/ReplicationSemanticAnalyzer.java#L196] > which checks for existence of these two properties and if they are set, > it unsets the *repl.target.for* property. > Because of this optimisedBootstrap won't be triggered because it checks for > the existence of *repl.target.for* property during repl dump on target > [HERE|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/repl/OptimisedBootstrapUtils.java#L93]. > > Fix : remove the code which unsets repl.target.for in > ReplicationSemanticAnalyzer, because second dump cycle of optimized bootstrap > unsets it. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-26956) Improve find_in_set function
[ https://issues.apache.org/jira/browse/HIVE-26956?focusedWorklogId=839814=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839814 ] ASF GitHub Bot logged work on HIVE-26956: - Author: ASF GitHub Bot Created on: 18/Jan/23 03:23 Start Date: 18/Jan/23 03:23 Worklog Time Spent: 10m Work Description: TaoZex opened a new pull request, #3961: URL: https://github.com/apache/hive/pull/3961 ### What changes were proposed in this pull request? Improve find_in_set function ### Why are the changes needed? Code redundancy ### Does this PR introduce _any_ user-facing change? no ### How was this patch tested? Issue Time Tracking --- Worklog Id: (was: 839814) Time Spent: 40m (was: 0.5h) > Improv find_in_set function > --- > > Key: HIVE-26956 > URL: https://issues.apache.org/jira/browse/HIVE-26956 > Project: Hive > Issue Type: Improvement >Reporter: Bingye Chen >Assignee: Bingye Chen >Priority: Minor > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > Improv find_in_set function -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-26956) Improve find_in_set function
[ https://issues.apache.org/jira/browse/HIVE-26956?focusedWorklogId=839813=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839813 ] ASF GitHub Bot logged work on HIVE-26956: - Author: ASF GitHub Bot Created on: 18/Jan/23 03:23 Start Date: 18/Jan/23 03:23 Worklog Time Spent: 10m Work Description: TaoZex closed pull request #3961: HIVE-26956: Improve find_in_set function URL: https://github.com/apache/hive/pull/3961 Issue Time Tracking --- Worklog Id: (was: 839813) Time Spent: 0.5h (was: 20m) > Improv find_in_set function > --- > > Key: HIVE-26956 > URL: https://issues.apache.org/jira/browse/HIVE-26956 > Project: Hive > Issue Type: Improvement >Reporter: Bingye Chen >Assignee: Bingye Chen >Priority: Minor > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > Improv find_in_set function -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-26739) When kerberos is enabled, hiveserver2 error connecting metastore: No valid credentials provided
[ https://issues.apache.org/jira/browse/HIVE-26739?focusedWorklogId=839812=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839812 ] ASF GitHub Bot logged work on HIVE-26739: - Author: ASF GitHub Bot Created on: 18/Jan/23 03:17 Start Date: 18/Jan/23 03:17 Worklog Time Spent: 10m Work Description: xiuzhu9527 commented on PR #3764: URL: https://github.com/apache/hive/pull/3764#issuecomment-1386425591 thx! Issue Time Tracking --- Worklog Id: (was: 839812) Time Spent: 50m (was: 40m) > When kerberos is enabled, hiveserver2 error connecting metastore: No valid > credentials provided > --- > > Key: HIVE-26739 > URL: https://issues.apache.org/jira/browse/HIVE-26739 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.0, 3.0.0 >Reporter: weiliang hao >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > If the environment variable HADOOP_USER_NAME exists, hiveserver2 error > connecting metastore: No valid credentials provided. > There is a problem with the getUGI method of the > org.apache.hadoop.hive.shims.Utils class to obtain the UGI. It should be > added to determine whether 'UserGroupInformation IsSecurityEnabled () `. If > it is true, it returns' UserGroupInformation GetCurrentUser() `. 
If it is > false, the user name is obtained from the environment variable > HADOOP_USER_NAME to create a UGI > > {code:java} > 2022-11-15T15:41:06,971 ERROR [HiveServer2-Background-Pool: Thread-36] > transport.TSaslTransport: SASL negotiation failure > javax.security.sasl.SaslException: GSS initiate failed > at > com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211) > ~[?:1.8.0_144] > at > org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94) > ~[hive-exec-3.1.3.jar:3.1.3] > at > org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271) > ~[hive-exec-3.1.3.jar:3.1.3] > at > org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37) > ~[hive-exec-3.1.3.jar:3.1.3] > at > org.apache.hadoop.hive.metastore.security.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:51) > ~[hive-exec-3.1.3.jar:3.1.3] > at > org.apache.hadoop.hive.metastore.security.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:48) > ~[hive-exec-3.1.3.jar:3.1.3] > at java.security.AccessController.doPrivileged(Native Method) > ~[?:1.8.0_144] > at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_144] > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) > ~[hadoop-common-3.2.1.jar:?] 
> at > org.apache.hadoop.hive.metastore.security.TUGIAssumingTransport.open(TUGIAssumingTransport.java:48) > ~[hive-exec-3.1.3.jar:3.1.3] > at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:516) > ~[hive-exec-3.1.3.jar:3.1.3] > at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.(HiveMetaStoreClient.java:224) > ~[hive-exec-3.1.3.jar:3.1.3] > at > org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.(SessionHiveMetaStoreClient.java:94) > ~[hive-exec-3.1.3.jar:3.1.3] > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native > Method) ~[?:1.8.0_144] > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > ~[?:1.8.0_144] > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > ~[?:1.8.0_144] > at java.lang.reflect.Constructor.newInstance(Constructor.java:423) > ~[?:1.8.0_144] > at > org.apache.hadoop.hive.metastore.utils.JavaUtils.newInstance(JavaUtils.java:84) > ~[hive-exec-3.1.3.jar:3.1.3] > at > org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.(RetryingMetaStoreClient.java:95) > ~[hive-exec-3.1.3.jar:3.1.3] > at > org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:148) > ~[hive-exec-3.1.3.jar:3.1.3] > at > org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:119) > ~[hive-exec-3.1.3.jar:3.1.3] > at > org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:4306) > ~[hive-exec-3.1.3.jar:3.1.3] > at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:4374) > ~[hive-exec-3.1.3.jar:3.1.3] > at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:4354) > ~[hive-exec-3.1.3.jar:3.1.3] > at >
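The proposed getUGI check can be sketched without Hadoop on the classpath. The two suppliers below are hypothetical stand-ins for UserGroupInformation.getCurrentUser() and for reading the HADOOP_USER_NAME environment variable:

```java
import java.util.function.Supplier;

// Dependency-free sketch of the fix described above: trust the Kerberos
// login when security is enabled, and fall back to HADOOP_USER_NAME only
// when it is not.
public class UgiSketch {
    static String resolveUser(boolean securityEnabled,
                              Supplier<String> kerberosCurrentUser,
                              Supplier<String> envHadoopUserName) {
        if (securityEnabled) {
            return kerberosCurrentUser.get(); // never let the env var win
        }
        String envUser = envHadoopUserName.get();
        return (envUser != null && !envUser.isEmpty())
                ? envUser
                : kerberosCurrentUser.get();
    }

    public static void main(String[] args) {
        // With Kerberos enabled, HADOOP_USER_NAME must be ignored.
        System.out.println(
            resolveUser(true, () -> "hive/host@REALM", () -> "alice"));
        // prints hive/host@REALM
    }
}
```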
[jira] [Work logged] (HIVE-26904) QueryCompactor failed in commitCompaction if the tmp table dir is already removed
[ https://issues.apache.org/jira/browse/HIVE-26904?focusedWorklogId=839811=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839811 ] ASF GitHub Bot logged work on HIVE-26904: - Author: ASF GitHub Bot Created on: 18/Jan/23 03:10 Start Date: 18/Jan/23 03:10 Worklog Time Spent: 10m Work Description: stiga-huang commented on code in PR #3910: URL: https://github.com/apache/hive/pull/3910#discussion_r1073054442

## ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/QueryCompactor.java:

@@ -245,16 +243,32 @@ private static void disableLlapCaching(HiveConf conf) {
  * @throws IOException the directory cannot be deleted
  * @throws HiveException the table is not found
  */
-static void cleanupEmptyDir(HiveConf conf, String tmpTableName) throws IOException, HiveException {
+static void cleanupEmptyTableDir(HiveConf conf, String tmpTableName)
+    throws IOException, HiveException {
   org.apache.hadoop.hive.ql.metadata.Table tmpTable = Hive.get().getTable(tmpTableName);
   if (tmpTable != null) {
-    Path path = new Path(tmpTable.getSd().getLocation());
-    FileSystem fs = path.getFileSystem(conf);
+    cleanupEmptyDir(conf, new Path(tmpTable.getSd().getLocation()));
+  }
+}
+
+/**
+ * Remove the directory if it's empty.
+ * @param conf the Hive configuration
+ * @param path path of the directory
+ * @throws IOException if any IO error occurs
+ */
+static void cleanupEmptyDir(HiveConf conf, Path path) throws IOException {
+  FileSystem fs = path.getFileSystem(conf);
+  try {
     if (!fs.listFiles(path, false).hasNext()) {
       fs.delete(path, true);
     }
+  } catch (FileNotFoundException e) {
+    // Ignore the case when the dir was already removed
+    LOG.warn("Ignored exception during cleanup {}", path, e);

Review Comment: FWIW, the following log shows the stacktrace of where the `FileNotFoundException` is thrown:
```
2023-01-02T02:12:55,849 ERROR [impala-ec2-centos79-m6i-4xlarge-ondemand-1428.vpc.cloudera.com-48_executor] compactor.Worker: Caught exception while trying to compact id:15,dbname:partial_catalog_info_test,tableName:insert_only_partitioned,partName:part=1,state:^@,type:MINOR,enqueueTime:0,start:0,properties:null,runAs:jenkins,tooManyAborts:false,hasOldAbort:false,highestWriteId:3,errorMessage:null,workerId: null,initiatorId: null,retryRetention0. Marking failed to avoid repeated failures
java.io.FileNotFoundException: File hdfs://localhost:20500/tmp/hive/jenkins/092b533a-81c8-4b95-88e4-9472cf6f365d/_tmp_space.db/62ec04fb-e2d2-4a99-a454-ae709a3cccfe does not exist.
    at org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1275) ~[hadoop-hdfs-client-3.1.1.7.2.15.4-6.jar:?]
    at org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1249) ~[hadoop-hdfs-client-3.1.1.7.2.15.4-6.jar:?]
    at org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1194) ~[hadoop-hdfs-client-3.1.1.7.2.15.4-6.jar:?]
    at org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1190) ~[hadoop-hdfs-client-3.1.1.7.2.15.4-6.jar:?]
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) ~[hadoop-common-3.1.1.7.2.15.4-6.jar:?]
    at org.apache.hadoop.hdfs.DistributedFileSystem.listLocatedStatus(DistributedFileSystem.java:1208) ~[hadoop-hdfs-client-3.1.1.7.2.15.4-6.jar:?]
    at org.apache.hadoop.fs.FileSystem.listLocatedStatus(FileSystem.java:2144) ~[hadoop-common-3.1.1.7.2.15.4-6.jar:?]
    at org.apache.hadoop.fs.FileSystem$5.<init>(FileSystem.java:2302) ~[hadoop-common-3.1.1.7.2.15.4-6.jar:?]
    at org.apache.hadoop.fs.FileSystem.listFiles(FileSystem.java:2299) ~[hadoop-common-3.1.1.7.2.15.4-6.jar:?]
    at org.apache.hadoop.hive.ql.txn.compactor.QueryCompactor$Util.cleanupEmptyDir(QueryCompactor.java:261) ~[hive-exec-3.1.3000.2022.0.13.0-60.jar:3.1.3000.2022.0.13.0-60]
    at org.apache.hadoop.hive.ql.txn.compactor.MmMinorQueryCompactor.commitCompaction(MmMinorQueryCompactor.java:72) ~[hive-exec-3.1.3000.2022.0.13.0-60.jar:3.1.3000.2022.0.13.0-60]
    at org.apache.hadoop.hive.ql.txn.compactor.QueryCompactor.runCompactionQueries(QueryCompactor.java:146) ~[hive-exec-3.1.3000.2022.0.13.0-60.jar:3.1.3000.2022.0.13.0-60]
    at org.apache.hadoop.hive.ql.txn.compactor.MmMinorQueryCompactor.runCompaction(MmMinorQueryCompactor.java:63) ~[hive-exec-3.1.3000.2022.0.13.0-60.jar:3.1.3000.2022.0.13.0-60]
    at org.apache.hadoop.hive.ql.txn.compactor.Worker.findNextCompactionAndExecute(Worker.java:435) ~[hive-exec-3.1.3000.2022.0.13.0-60.jar:3.1.3000.2022.0.13.0-60]
```
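The pattern discussed in this review — delete a directory only if it is empty, and treat a directory that another worker already removed as successfully cleaned up — can be sketched as a standalone analogue with plain `java.nio.file`. This is hypothetical demo code, not the Hadoop `FileSystem` API used in the patch; `java.nio` signals the missing-directory case with `NoSuchFileException` where HDFS throws `FileNotFoundException`:

```java
import java.io.IOException;
import java.nio.file.*;

public class CleanupDemo {
    // Delete `dir` only if it is empty; a concurrently removed dir counts as cleaned.
    static boolean cleanupEmptyDir(Path dir) throws IOException {
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            if (!entries.iterator().hasNext()) {
                Files.delete(dir);      // empty: remove it
                return true;
            }
            return false;               // non-empty: leave it alone
        } catch (NoSuchFileException e) {
            // Another worker already removed the directory: nothing left to clean.
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempDirectory("cleanup-demo");
        System.out.println(cleanupEmptyDir(tmp));                   // empty dir: deleted
        System.out.println(cleanupEmptyDir(tmp.resolve("gone")));   // missing dir: tolerated
    }
}
```

The key design point matches the review thread: listing and deleting are not atomic, so the catch clause must cover both the listing and the delete, otherwise a race with another cleanup still fails the compaction.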
[jira] [Work logged] (HIVE-26915) Backport of HIVE-23692 TestCodahaleMetrics.testFileReporting is flaky
[ https://issues.apache.org/jira/browse/HIVE-26915?focusedWorklogId=839810=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839810 ] ASF GitHub Bot logged work on HIVE-26915: - Author: ASF GitHub Bot Created on: 18/Jan/23 02:46 Start Date: 18/Jan/23 02:46 Worklog Time Spent: 10m Work Description: amanraj2520 commented on PR #3928: URL: https://github.com/apache/hive/pull/3928#issuecomment-1386397424 @zabetak @abstractdog Can you please review and merge this Issue Time Tracking --- Worklog Id: (was: 839810) Time Spent: 1h 10m (was: 1h) > Backport of HIVE-23692 TestCodahaleMetrics.testFileReporting is flaky > - > > Key: HIVE-26915 > URL: https://issues.apache.org/jira/browse/HIVE-26915 > Project: Hive > Issue Type: Sub-task >Reporter: Aman Raj >Assignee: Aman Raj >Priority: Major > Labels: pull-request-available > Time Spent: 1h 10m > Remaining Estimate: 0h > > This was committed in master without a HIVE Jira task. This is the commit id > : 130f80445d589cdd82904cea1073c84d1368d079 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-26915) Backport of HIVE-23692 TestCodahaleMetrics.testFileReporting is flaky
[ https://issues.apache.org/jira/browse/HIVE-26915?focusedWorklogId=839809=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839809 ] ASF GitHub Bot logged work on HIVE-26915: - Author: ASF GitHub Bot Created on: 18/Jan/23 02:45 Start Date: 18/Jan/23 02:45 Worklog Time Spent: 10m Work Description: amanraj2520 commented on PR #3928: URL: https://github.com/apache/hive/pull/3928#issuecomment-1386395934 @zabetak Here is another flakiness http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-3954/4/tests Issue Time Tracking --- Worklog Id: (was: 839809) Time Spent: 1h (was: 50m) > Backport of HIVE-23692 TestCodahaleMetrics.testFileReporting is flaky > - > > Key: HIVE-26915 > URL: https://issues.apache.org/jira/browse/HIVE-26915 > Project: Hive > Issue Type: Sub-task >Reporter: Aman Raj >Assignee: Aman Raj >Priority: Major > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > This was committed in master without a HIVE Jira task. This is the commit id > : 130f80445d589cdd82904cea1073c84d1368d079 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-26945) Test fixes for query*.q files
[ https://issues.apache.org/jira/browse/HIVE-26945?focusedWorklogId=839808=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839808 ] ASF GitHub Bot logged work on HIVE-26945: - Author: ASF GitHub Bot Created on: 18/Jan/23 02:42 Start Date: 18/Jan/23 02:42 Worklog Time Spent: 10m Work Description: amanraj2520 commented on PR #3954: URL: https://github.com/apache/hive/pull/3954#issuecomment-1386394089 Hi @zabetak @abstractdog Can you please approve this? There is one flaky test that is failing. I have fixed that in https://github.com/apache/hive/pull/3928 Issue Time Tracking --- Worklog Id: (was: 839808) Time Spent: 20m (was: 10m) > Test fixes for query*.q files > - > > Key: HIVE-26945 > URL: https://issues.apache.org/jira/browse/HIVE-26945 > Project: Hive > Issue Type: Sub-task >Reporter: Aman Raj >Assignee: Aman Raj >Priority: Critical > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > The tests have outdated q.out files which need to be updated. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (HIVE-26893) Extend batch partition APIs to ignore partition schemas
[ https://issues.apache.org/jira/browse/HIVE-26893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sai Hemanth Gantasala reassigned HIVE-26893: Assignee: Sai Hemanth Gantasala > Extend batch partition APIs to ignore partition schemas > --- > > Key: HIVE-26893 > URL: https://issues.apache.org/jira/browse/HIVE-26893 > Project: Hive > Issue Type: New Feature > Components: Metastore >Reporter: Quanlong Huang >Assignee: Sai Hemanth Gantasala >Priority: Major > > There are several HMS APIs that return a list of partitions, e.g. > get_partitions_ps(), get_partitions_by_names(), add_partitions_req() with > needResult=true, etc. Each partition instance will have a unique list of > FieldSchemas as the partition schema:
> {code:java}
> org.apache.hadoop.hive.metastore.api.Partition
>   -> org.apache.hadoop.hive.metastore.api.StorageDescriptor
>     -> cols: list<FieldSchema>
> {code}
> This can result in a large memory footprint for wide tables (e.g. with 2k cols). See the heap histogram in IMPALA-11812 as an example. > Some engines like Impala don't actually use/respect the partition-level schema. It's a waste of network/serde resources to transmit them. It'd be nice if these APIs provided an optional boolean flag for ignoring partition schemas, so HMS clients (e.g. Impala) don't need to clear them later (to save memory). -- This message was sent by Atlassian Jira (v8.20.10#820010)
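To make the memory argument in HIVE-26893 concrete, here is a small self-contained Java sketch. The classes are plain stand-ins for the Thrift-generated `Partition`/`StorageDescriptor`/`FieldSchema` shapes, and the numbers are illustrative: fetching 1,000 partitions of a 2,000-column table materializes two million duplicated `FieldSchema` entries, which is what clients clear manually today and what an ignore-schemas flag would avoid ever serializing:

```java
import java.util.ArrayList;
import java.util.List;

public class PartitionSchemaDemo {
    record FieldSchema(String name, String type) {}
    static class StorageDescriptor { final List<FieldSchema> cols = new ArrayList<>(); }
    static class Partition { final StorageDescriptor sd = new StorageDescriptor(); }

    // Simulate an HMS response: every partition carries its own copy of the column list.
    static List<Partition> fetchPartitions(int numParts, int numCols) {
        List<Partition> parts = new ArrayList<>();
        for (int p = 0; p < numParts; p++) {
            Partition part = new Partition();
            for (int c = 0; c < numCols; c++) {
                part.sd.cols.add(new FieldSchema("col" + c, "string"));
            }
            parts.add(part);
        }
        return parts;
    }

    static long totalFieldSchemas(List<Partition> parts) {
        return parts.stream().mapToLong(p -> p.sd.cols.size()).sum();
    }

    public static void main(String[] args) {
        List<Partition> parts = fetchPartitions(1000, 2000);
        System.out.println(totalFieldSchemas(parts) + " duplicated FieldSchema entries");

        // What a client like Impala does today to reclaim memory: clear after the RPC.
        // A server-side "ignore partition schemas" flag would skip serializing them at all.
        parts.forEach(p -> p.sd.cols.clear());
        System.out.println(totalFieldSchemas(parts) + " entries after clearing");
    }
}
```

The client-side `clear()` only recovers heap after the response has already been deserialized; the proposed flag also saves the network and serde cost, which is why it belongs on the API rather than in every client.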
[jira] [Work logged] (HIVE-26648) Upgrade Bouncy Castle to 1.70 due to high CVEs
[ https://issues.apache.org/jira/browse/HIVE-26648?focusedWorklogId=839786=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839786 ] ASF GitHub Bot logged work on HIVE-26648: - Author: ASF GitHub Bot Created on: 18/Jan/23 00:21 Start Date: 18/Jan/23 00:21 Worklog Time Spent: 10m Work Description: github-actions[bot] closed pull request #3744: HIVE-26648:removing direct depedency of bouncycastle URL: https://github.com/apache/hive/pull/3744 Issue Time Tracking --- Worklog Id: (was: 839786) Time Spent: 2.5h (was: 2h 20m) > Upgrade Bouncy Castle to 1.70 due to high CVEs > --- > > Key: HIVE-26648 > URL: https://issues.apache.org/jira/browse/HIVE-26648 > Project: Hive > Issue Type: Task >Reporter: Devaspati Krishnatri >Assignee: Devaspati Krishnatri >Priority: Major > Labels: pull-request-available > Time Spent: 2.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-26598) Fix unsetting of db params for optimized bootstrap when repl dump initiates data copy
[ https://issues.apache.org/jira/browse/HIVE-26598?focusedWorklogId=839784=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839784 ] ASF GitHub Bot logged work on HIVE-26598: - Author: ASF GitHub Bot Created on: 18/Jan/23 00:21 Start Date: 18/Jan/23 00:21 Worklog Time Spent: 10m Work Description: github-actions[bot] commented on PR #3780: URL: https://github.com/apache/hive/pull/3780#issuecomment-1386276643 This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Feel free to reach out on the d...@hive.apache.org list if the patch is in need of reviews. Issue Time Tracking --- Worklog Id: (was: 839784) Time Spent: 0.5h (was: 20m) > Fix unsetting of db params for optimized bootstrap when repl dump initiates > data copy > - > > Key: HIVE-26598 > URL: https://issues.apache.org/jira/browse/HIVE-26598 > Project: Hive > Issue Type: Bug >Reporter: Teddy Choi >Assignee: Rakshith C >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > when hive.repl.run.data.copy.tasks.on.target is set to false, repl dump task > will initiate the copy task from source cluster to staging directory. > In current code flow repl dump task dumps the metadata and then creates > another repl dump task with datacopyIterators initialized. > when the second dump cycle executes, it directly begins data copy tasks. > Because of this we don't enter second reverse dump flow and > unsetDbPropertiesForOptimisedBootstrap is never set to true again. > this results in db params (repl.target.for, repl.background.threads, etc) not > being unset. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-26648) Upgrade Bouncy Castle to 1.70 due to high CVEs
[ https://issues.apache.org/jira/browse/HIVE-26648?focusedWorklogId=839787=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839787 ] ASF GitHub Bot logged work on HIVE-26648: - Author: ASF GitHub Bot Created on: 18/Jan/23 00:21 Start Date: 18/Jan/23 00:21 Worklog Time Spent: 10m Work Description: github-actions[bot] closed pull request #3727: HIVE-26648:Upgrade Bouncy Castle to 1.70 due to high CVEs URL: https://github.com/apache/hive/pull/3727 Issue Time Tracking --- Worklog Id: (was: 839787) Time Spent: 2h 40m (was: 2.5h) > Upgrade Bouncy Castle to 1.70 due to high CVEs > --- > > Key: HIVE-26648 > URL: https://issues.apache.org/jira/browse/HIVE-26648 > Project: Hive > Issue Type: Task >Reporter: Devaspati Krishnatri >Assignee: Devaspati Krishnatri >Priority: Major > Labels: pull-request-available > Time Spent: 2h 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-26757) Add sfs+ofs support
[ https://issues.apache.org/jira/browse/HIVE-26757?focusedWorklogId=839785=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839785 ] ASF GitHub Bot logged work on HIVE-26757: - Author: ASF GitHub Bot Created on: 18/Jan/23 00:21 Start Date: 18/Jan/23 00:21 Worklog Time Spent: 10m Work Description: github-actions[bot] commented on PR #3779: URL: https://github.com/apache/hive/pull/3779#issuecomment-1386276682 This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Feel free to reach out on the d...@hive.apache.org list if the patch is in need of reviews. Issue Time Tracking --- Worklog Id: (was: 839785) Time Spent: 1h (was: 50m) > Add sfs+ofs support > --- > > Key: HIVE-26757 > URL: https://issues.apache.org/jira/browse/HIVE-26757 > Project: Hive > Issue Type: Improvement >Reporter: Michael Smith >Assignee: Michael Smith >Priority: Major > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > [https://github.com/apache/hive/blob/ebb1e2fa9914bcccecad261d53338933b699ccb1/ql/src/java/org/apache/hadoop/hive/ql/io/SingleFileSystem.java#L80] > shows SFS support for Ozone's o3fs protocol, but not the newer ofs protocol. > Please add support for {{{}sfs+ofs{}}}. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-26925) MV with iceberg storage format fails when contains 'PARTITIONED ON' clause due to column number/types difference.
[ https://issues.apache.org/jira/browse/HIVE-26925?focusedWorklogId=839778=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839778 ] ASF GitHub Bot logged work on HIVE-26925: - Author: ASF GitHub Bot Created on: 17/Jan/23 23:13 Start Date: 17/Jan/23 23:13 Worklog Time Spent: 10m Work Description: sonarcloud[bot] commented on PR #3939: URL: https://github.com/apache/hive/pull/3939#issuecomment-1386210004 Kudos, SonarCloud Quality Gate passed! (0 Bugs, 0 Vulnerabilities, 0 Security Hotspots, 0 Code Smells; no coverage or duplication information.)
Issue Time Tracking --- Worklog Id: (was: 839778) Time Spent: 1h 50m (was: 1h 40m) > MV with iceberg storage format fails when contains 'PARTITIONED ON' clause > due to column number/types difference. > - > > Key: HIVE-26925 > URL: https://issues.apache.org/jira/browse/HIVE-26925 > Project: Hive > Issue Type: Bug > Components: Iceberg integration >Reporter: Dharmik Thakkar >Assignee: Krisztian Kasa >Priority: Critical > Labels: pull-request-available > Time Spent: 1h 50m > Remaining Estimate: 0h > > MV with iceberg storage format fails when contains 'PARTITIONED ON' clause > due to column number/types difference. > {code:java} > !!!
annotations iceberg
> >>> use iceberg_test_db_hive;
> No rows affected
> >>> set hive.exec.max.dynamic.partitions=2000;
> >>> set hive.exec.max.dynamic.partitions.pernode=2000;
> >>> drop materialized view if exists mv_agg_gby_col_partitioned;
> >>> create materialized view mv_agg_gby_col_partitioned PARTITIONED ON (t) stored by iceberg stored as orc tblproperties ('format-version'='1') as select b,f,sum(b), sum(f),t from all100k group by b,f,v,c,t;
> >>> analyze table mv_agg_gby_col_partitioned compute statistics for columns;
> >>> set hive.explain.user=false;
> >>> explain select b,f,sum(b) from all100k where t=93 group by c,v,f,b;
> !!! match row_contains
>
[jira] [Work logged] (HIVE-26928) LlapIoImpl::getParquetFooterBuffersFromCache throws exception when metadata cache is disabled
[ https://issues.apache.org/jira/browse/HIVE-26928?focusedWorklogId=839776=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839776 ] ASF GitHub Bot logged work on HIVE-26928: - Author: ASF GitHub Bot Created on: 17/Jan/23 22:48 Start Date: 17/Jan/23 22:48 Worklog Time Spent: 10m Work Description: sonarcloud[bot] commented on PR #3962: URL: https://github.com/apache/hive/pull/3962#issuecomment-1386187492 Kudos, SonarCloud Quality Gate passed! (0 Bugs, 0 Vulnerabilities, 0 Security Hotspots, 0 Code Smells; no coverage or duplication information.)
Issue Time Tracking --- Worklog Id: (was: 839776) Time Spent: 20m (was: 10m) > LlapIoImpl::getParquetFooterBuffersFromCache throws exception when metadata > cache is disabled > - > > Key: HIVE-26928 > URL: https://issues.apache.org/jira/browse/HIVE-26928 > Project: Hive > Issue Type: Improvement > Components: Iceberg integration >Reporter: Rajesh Balamohan >Assignee: Simhadri Govindappa >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > When metadata / LLAP cache is disabled, "iceberg + parquet" throws the > following error. "{color:#5a656d}hive.llap.io.memory.mode=none"{color} > It should check for "metadatacache" correctly or fix it in LlapIoImpl.
>
> {noformat}
> Caused by: java.lang.NullPointerException: Metadata cache must not be null
> at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:897)
> at org.apache.hadoop.hive.llap.io.api.impl.LlapIoImpl.getParquetFooterBuffersFromCache(LlapIoImpl.java:467)
> at org.apache.iceberg.mr.hive.vector.HiveVectorizedReader.parquetRecordReader(HiveVectorizedReader.java:227)
> at org.apache.iceberg.mr.hive.vector.HiveVectorizedReader.reader(HiveVectorizedReader.java:162)
> at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
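The NPE quoted in HIVE-26928 comes from a hard `Preconditions.checkNotNull` on a cache that is legitimately null when `hive.llap.io.memory.mode=none`. A hedged sketch of the alternative the ticket implies — treat a disabled (null) cache, or a cache miss, as "not cached" and fall back to a direct read — in plain Java with invented names (`MetadataCache` and `footer` are not the real Hive types):

```java
import java.util.Optional;
import java.util.function.Supplier;

public class CacheGuardDemo {
    interface MetadataCache { byte[] getFooter(String path); }

    // Instead of Preconditions.checkNotNull(cache) — which throws when the cache
    // feature is disabled — a null cache or a null lookup result simply falls
    // back to reading the footer from the file.
    static byte[] footer(MetadataCache cache, String path, Supplier<byte[]> readFromFile) {
        return Optional.ofNullable(cache)
                .map(c -> c.getFooter(path))   // null cache, or cache miss, -> empty
                .orElseGet(readFromFile);      // fall back to the direct read
    }

    public static void main(String[] args) {
        byte[] fromFile = footer(null, "/warehouse/t/file.parquet", () -> new byte[]{1, 2, 3});
        System.out.println(fromFile.length);   // cache disabled: the file read supplied the bytes
    }
}
```

The design choice is that a disabled cache should degrade to the uncached code path rather than be treated as a programming error, which is exactly the distinction `checkNotNull` cannot express.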
[jira] [Work logged] (HIVE-26924) Alter materialized view enable rewrite throws SemanticException for source iceberg table
[ https://issues.apache.org/jira/browse/HIVE-26924?focusedWorklogId=839767=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839767 ] ASF GitHub Bot logged work on HIVE-26924: - Author: ASF GitHub Bot Created on: 17/Jan/23 21:39 Start Date: 17/Jan/23 21:39 Worklog Time Spent: 10m Work Description: scarlin-cloudera commented on code in PR #3936: URL: https://github.com/apache/hive/pull/3936#discussion_r1072857383

## ql/src/java/org/apache/hadoop/hive/ql/ddl/view/materialized/alter/rewrite/AlterMaterializedViewRewriteAnalyzer.java:

@@ -68,10 +68,12 @@ public void analyzeInternal(ASTNode root) throws SemanticException {
     Table materializedViewTable = getTable(tableName, true);

     // One last test: if we are enabling the rewrite, we need to check that query
-    // only uses transactional (MM and ACID) tables
+    // only uses transactional (MM and ACID and Iceberg) tables
     if (rewriteEnable) {
       for (SourceTable sourceTable : materializedViewTable.getMVMetadata().getSourceTables()) {
-        if (!AcidUtils.isTransactionalTable(sourceTable.getTable())) {
+          Table table = new Table(sourceTable.getTable());
+          if (!AcidUtils.isTransactionalTable(sourceTable.getTable()) &&
+              !(table.isNonNative() && table.getStorageHandler().areSnapshotsSupported())) {

Review Comment: Out of curiosity (and I don't know this code at all), what is the reason for the "isNonNative()" check? I guess there's a native table where "areSnapshotsSupported()" returns true? From the name, it sounds like this alone should have been enough.
Issue Time Tracking --- Worklog Id: (was: 839767) Time Spent: 40m (was: 0.5h) > Alter materialized view enable rewrite throws SemanticException for source > iceberg table > > > Key: HIVE-26924 > URL: https://issues.apache.org/jira/browse/HIVE-26924 > Project: Hive > Issue Type: Bug > Components: Iceberg integration >Reporter: Dharmik Thakkar >Assignee: Krisztian Kasa >Priority: Critical > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > alter materialized view enable rewrite throws SemanticException for source > iceberg table > SQL test
> {code:java}
> >>> create materialized view mv_rewrite as select t, si from all100k where t>115;
> >>> analyze table mv_rewrite compute statistics for columns;
> >>> set hive.explain.user=false;
> >>> explain select si,t from all100k where t>116 and t<120;
> !!! match row_contains
> alias: iceberg_test_db_hive.mv_rewrite
> >>> alter materialized view mv_rewrite disable rewrite;
> >>> explain select si,t from all100k where t>116 and t<120;
> !!! match row_contains
> alias: all100k
> >>> alter materialized view mv_rewrite enable rewrite;
> >>> explain select si,t from all100k where t>116 and t<120;
> !!! match row_contains
> alias: iceberg_test_db_hive.mv_rewrite
> >>> drop materialized view mv_rewrite; {code}
>
> Error
> {code:java}
> 2023-01-10T18:40:34,303 INFO [pool-3-thread-1] jdbc.TestDriver: Query: alter materialized view mv_rewrite enable rewrite
> 2023-01-10T18:40:34,365 INFO [Thread-10] jdbc.TestDriver: INFO : Compiling command(queryId=hive_20230110184034_f557b4a6-40a0-42ba-8e67-2f273f50af36): alter materialized view mv_rewrite enable rewrite
> 2023-01-10T18:40:34,426 INFO [Thread-10] jdbc.TestDriver: ERROR : FAILED: SemanticException Automatic rewriting for materialized view cannot be enabled if the materialized view uses non-transactional tables
> 2023-01-10T18:40:34,426 INFO [Thread-10] jdbc.TestDriver: org.apache.hadoop.hive.ql.parse.SemanticException: Automatic rewriting for materialized view cannot be enabled if the materialized view uses non-transactional tables
> 2023-01-10T18:40:34,426 INFO [Thread-10] jdbc.TestDriver: at org.apache.hadoop.hive.ql.ddl.view.materialized.alter.rewrite.AlterMaterializedViewRewriteAnalyzer.analyzeInternal(AlterMaterializedViewRewriteAnalyzer.java:75)
> 2023-01-10T18:40:34,426 INFO [Thread-10] jdbc.TestDriver: at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:313)
> 2023-01-10T18:40:34,427 INFO [Thread-10] jdbc.TestDriver: at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:222)
> 2023-01-10T18:40:34,427 INFO [Thread-10] jdbc.TestDriver: at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:105)
> 2023-01-10T18:40:34,427 INFO [Thread-10] jdbc.TestDriver: at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:201)
> 2023-01-10T18:40:34,427 INFO [Thread-10] jdbc.TestDriver: at
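The failure in this ticket reduces to the eligibility test discussed in the review above: a source table qualifies for automatic rewrite either by being transactional, or by being a non-native (storage-handler-backed) table whose handler supports snapshots, such as Iceberg. A minimal truth-table sketch of that predicate (hypothetical standalone code; the real check lives in `AlterMaterializedViewRewriteAnalyzer`):

```java
public class RewriteEligibilityDemo {
    // A source table qualifies if it is transactional (MM/ACID), or if it is a
    // non-native table whose storage handler supports snapshots (e.g. Iceberg).
    static boolean canEnableRewrite(boolean transactional, boolean nonNative,
                                    boolean snapshotsSupported) {
        return transactional || (nonNative && snapshotsSupported);
    }

    public static void main(String[] args) {
        System.out.println(canEnableRewrite(true, false, false));  // ACID table
        System.out.println(canEnableRewrite(false, true, true));   // Iceberg table
        System.out.println(canEnableRewrite(false, false, false)); // plain external table
    }
}
```

Before the patch only the first disjunct existed, which is why an Iceberg source (non-transactional but snapshot-capable) triggered the SemanticException in the log above.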
[jira] [Work logged] (HIVE-22977) Merge delta files instead of running a query in major/minor compaction
[ https://issues.apache.org/jira/browse/HIVE-22977?focusedWorklogId=839766=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839766 ] ASF GitHub Bot logged work on HIVE-22977: - Author: ASF GitHub Bot Created on: 17/Jan/23 21:36 Start Date: 17/Jan/23 21:36 Worklog Time Spent: 10m Work Description: sonarcloud[bot] commented on PR #3801: URL: https://github.com/apache/hive/pull/3801#issuecomment-1386083177 Kudos, SonarCloud Quality Gate passed! (0 Bugs, 0 Vulnerabilities, 0 Security Hotspots, 2 Code Smells; no coverage or duplication information.)
Issue Time Tracking --- Worklog Id: (was: 839766) Time Spent: 4h 40m (was: 4.5h) > Merge delta files instead of running a query in major/minor compaction > -- > > Key: HIVE-22977 > URL: https://issues.apache.org/jira/browse/HIVE-22977 > Project: Hive > Issue Type: Improvement >Reporter: László Pintér >Assignee: Sourabh Badhya >Priority: Major > Labels: pull-request-available > Attachments: HIVE-22977.01.patch, HIVE-22977.02.patch > > Time Spent: 4h 40m > Remaining Estimate: 0h > > [Compaction Optimization] > We should analyse the possibility of moving a delta file instead of running a > major/minor compaction query. > Please consider the following use cases: > - full acid table but only insert queries were run.
This means that no > delete delta directories were created. Is it possible to merge the delta > directory contents without running a compaction query? > - full acid table, initiating queries through the streaming API. If there > are no abort transactions during the streaming, is it possible to merge the > delta directory contents without running a compaction query? -- This message was sent by Atlassian Jira (v8.20.10#820010)
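The merge idea described in HIVE-22977, moving insert-only delta files instead of running a compaction query, can be sketched roughly as follows. This is an illustrative sketch, not Hive code: `canMergeWithoutQuery` and `mergedDeltaName` are hypothetical helper names, and the only thing assumed from Hive is the ACID directory naming convention `delta_<minWriteId>_<maxWriteId>` (real Hive zero-pads write ids and tracks far more state).

```java
import java.util.List;

// Illustrative sketch only: helper names are hypothetical, not Hive APIs.
class DeltaMergeSketch {

    // A file-move based "merge" is only safe for insert-only history:
    // no delete deltas and no aborted transactions to filter out.
    static boolean canMergeWithoutQuery(List<String> deltaDirs, boolean hasAbortedTxns) {
        boolean hasDeleteDelta = deltaDirs.stream().anyMatch(d -> d.startsWith("delete_delta_"));
        return !hasDeleteDelta && !hasAbortedTxns;
    }

    // Name of the merged directory, delta_<minWriteId>_<maxWriteId>, computed
    // from the write-id ranges of the input delta directories.
    static String mergedDeltaName(List<String> deltaDirs) {
        long min = Long.MAX_VALUE;
        long max = Long.MIN_VALUE;
        for (String dir : deltaDirs) {
            String[] parts = dir.split("_"); // e.g. delta_0000001_0000001
            min = Math.min(min, Long.parseLong(parts[1]));
            max = Math.max(max, Long.parseLong(parts[2]));
        }
        return "delta_" + min + "_" + max; // real Hive zero-pads the write ids
    }
}
```

Under these assumptions, two insert deltas `delta_0000001_0000001` and `delta_0000002_0000002` with no aborts could simply have their contents moved under a combined `delta_1_2` directory.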
[jira] [Work logged] (HIVE-26922) Deadlock when rebuilding Materialized view stored by Iceberg
[ https://issues.apache.org/jira/browse/HIVE-26922?focusedWorklogId=839765&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839765 ]

ASF GitHub Bot logged work on HIVE-26922:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 17/Jan/23 21:31
            Start Date: 17/Jan/23 21:31
    Worklog Time Spent: 10m

Work Description: scarlin-cloudera commented on code in PR #3934:
URL: https://github.com/apache/hive/pull/3934#discussion_r1072847233

   ## ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java:
   ## @@ -3122,7 +3117,19 @@ Seems much cleaner if each stmt is identified as a particular HiveOperation (whi
        }
        return lockComponents;
      }
   -
   +
   +  private static LockType getLockTypeFromStorageHandler(WriteEntity output, Table t) {
   +    final HiveStorageHandler storageHandler = Preconditions.checkNotNull(t.getStorageHandler(),
   +        "Non-native tables must have an instance of storage handler.");
   +    LockType lockType = storageHandler.getLockType(output);
   +    if (null == LockType.findByValue(lockType.getValue())) {
   +      throw new IllegalArgumentException(String
   +          .format("Lock type [%s] for Database.Table [%s.%s] is unknown", lockType, t.getDbName(),

   Review Comment:
   Optional nit: I see this was copied from somewhere else, but there really only has to be one argument here, t.getCompleteName()

Issue Time Tracking
-------------------
    Worklog Id:     (was: 839765)
    Time Spent: 1h 20m  (was: 1h 10m)

> Deadlock when rebuilding Materialized view stored by Iceberg
> ------------------------------------------------------------
>
>                 Key: HIVE-26922
>                 URL: https://issues.apache.org/jira/browse/HIVE-26922
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Krisztian Kasa
>            Assignee: Krisztian Kasa
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> {code}
> create table tbl_ice(a int, b string, c int) stored by iceberg stored as orc
> tblproperties ('format-version'='1');
> insert into tbl_ice values (1, 'one', 50), (2, 'two', 51), (3, 'three', 52),
> (4, 'four', 53), (5, 'five', 54);
> create materialized view mat1 stored by iceberg stored as orc tblproperties
> ('format-version'='1') as
> select tbl_ice.b, tbl_ice.c from tbl_ice where tbl_ice.c > 52;
> insert into tbl_ice values (10, 'ten', 60);
> alter materialized view mat1 rebuild;
> {code}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
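The reviewed snippet validates a lock type and builds an error message; the reviewer's point is that a single qualified name (`t.getCompleteName()`) can replace the separate database/table arguments. A minimal, self-contained illustration of that pattern follows. The `LockType` enum here is a simplified stand-in, not Hive's actual Thrift-generated enum, and its numeric values are assumptions made for the sketch.

```java
import java.util.Arrays;

// Simplified stand-in for the lock-type validation in the reviewed snippet.
// LockType is NOT Hive's Thrift-generated enum; its values are assumed.
class LockTypeCheck {

    enum LockType {
        SHARED_READ(1), SHARED_WRITE(2), EXCLUSIVE(3);

        final int value;

        LockType(int value) { this.value = value; }

        // Mirrors the findByValue lookup in the snippet: null for unknown values.
        static LockType findByValue(int v) {
            return Arrays.stream(values()).filter(t -> t.value == v).findFirst().orElse(null);
        }
    }

    // Per the review comment, one qualified name (t.getCompleteName())
    // replaces the separate database/table arguments in the error message.
    static LockType validated(LockType lockType, String completeName) {
        if (LockType.findByValue(lockType.value) == null) {
            throw new IllegalArgumentException(
                String.format("Lock type [%s] for table [%s] is unknown", lockType, completeName));
        }
        return lockType;
    }
}
```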
[jira] [Work logged] (HIVE-26925) MV with iceberg storage format fails when contains 'PARTITIONED ON' clause due to column number/types difference.
[ https://issues.apache.org/jira/browse/HIVE-26925?focusedWorklogId=839741&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839741 ]

ASF GitHub Bot logged work on HIVE-26925:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 17/Jan/23 18:31
            Start Date: 17/Jan/23 18:31
    Worklog Time Spent: 10m

Work Description: sonarcloud[bot] commented on PR #3939:
URL: https://github.com/apache/hive/pull/3939#issuecomment-1385854313

   Kudos, SonarCloud Quality Gate passed! 0 Bugs, 0 Vulnerabilities, 0 Security Hotspots, 5 Code Smells; no coverage or duplication information.

Issue Time Tracking
-------------------
    Worklog Id:     (was: 839741)
    Time Spent: 1h 40m  (was: 1.5h)

> MV with iceberg storage format fails when contains 'PARTITIONED ON' clause
> due to column number/types difference.
> --------------------------------------------------------------------------
>
>                 Key: HIVE-26925
>                 URL: https://issues.apache.org/jira/browse/HIVE-26925
>             Project: Hive
>          Issue Type: Bug
>          Components: Iceberg integration
>            Reporter: Dharmik Thakkar
>            Assignee: Krisztian Kasa
>            Priority: Critical
>              Labels: pull-request-available
>          Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> MV with iceberg storage format fails when contains 'PARTITIONED ON' clause
> due to column number/types difference.
> {code:java}
> !!! annotations iceberg
> >>> use iceberg_test_db_hive;
> No rows affected
> >>> set hive.exec.max.dynamic.partitions=2000;
> >>> set hive.exec.max.dynamic.partitions.pernode=2000;
> >>> drop materialized view if exists mv_agg_gby_col_partitioned;
> >>> create materialized view mv_agg_gby_col_partitioned PARTITIONED ON (t)
> >>> stored by iceberg stored as orc tblproperties ('format-version'='1') as
> >>> select b,f,sum(b), sum(f),t from all100k group by b,f,v,c,t;
> >>> analyze table mv_agg_gby_col_partitioned compute statistics for columns;
> >>> set hive.explain.user=false;
> >>> explain select b,f,sum(b) from all100k where t=93 group by c,v,f,b;
> !!! match row_contains >
[jira] [Work logged] (HIVE-26928) LlapIoImpl::getParquetFooterBuffersFromCache throws exception when metadata cache is disabled
[ https://issues.apache.org/jira/browse/HIVE-26928?focusedWorklogId=839735&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839735 ]

ASF GitHub Bot logged work on HIVE-26928:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 17/Jan/23 18:17
            Start Date: 17/Jan/23 18:17
    Worklog Time Spent: 10m

Work Description: simhadri-g opened a new pull request, #3962:
URL: https://github.com/apache/hive/pull/3962

   …tion when metadata cache is disabled

   ### What changes were proposed in this pull request?
   If the metadata / LLAP cache is disabled (hive.llap.io.memory.mode=none) at the time the LLAP I/O is initialized on daemon startup, reading "iceberg + parquet" tables results in an NPE.

   ### Why are the changes needed?

   ### Does this PR introduce _any_ user-facing change?
   No

   ### How was this patch tested?
   1. Q file run using TestIcebergLlapLocalCliDriver.
   2. Manual test.

Issue Time Tracking
-------------------
    Worklog Id:     (was: 839735)
    Remaining Estimate: 0h
            Time Spent: 10m

> LlapIoImpl::getParquetFooterBuffersFromCache throws exception when metadata
> cache is disabled
> ---------------------------------------------------------------------------
>
>                 Key: HIVE-26928
>                 URL: https://issues.apache.org/jira/browse/HIVE-26928
>             Project: Hive
>          Issue Type: Improvement
>          Components: Iceberg integration
>            Reporter: Rajesh Balamohan
>            Assignee: Simhadri Govindappa
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> When the metadata / LLAP cache is disabled
> ("{color:#5a656d}hive.llap.io.memory.mode=none"{color}), "iceberg + parquet"
> throws the following error.
> It should check for "metadatacache" correctly or fix it in LlapIoImpl.
> > {noformat} > Caused by: java.lang.NullPointerException: Metadata cache must not be null > at > com.google.common.base.Preconditions.checkNotNull(Preconditions.java:897) > at > org.apache.hadoop.hive.llap.io.api.impl.LlapIoImpl.getParquetFooterBuffersFromCache(LlapIoImpl.java:467) > at > org.apache.iceberg.mr.hive.vector.HiveVectorizedReader.parquetRecordReader(HiveVectorizedReader.java:227) > at > org.apache.iceberg.mr.hive.vector.HiveVectorizedReader.reader(HiveVectorizedReader.java:162) > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:566) > at > org.apache.iceberg.common.DynMethods$UnboundMethod.invokeChecked(DynMethods.java:65) > at > org.apache.iceberg.common.DynMethods$UnboundMethod.invoke(DynMethods.java:77) > at > org.apache.iceberg.common.DynMethods$StaticMethod.invoke(DynMethods.java:196) > at > org.apache.iceberg.mr.mapreduce.IcebergInputFormat$IcebergRecordReader.openVectorized(IcebergInputFormat.java:331) > at > org.apache.iceberg.mr.mapreduce.IcebergInputFormat$IcebergRecordReader.open(IcebergInputFormat.java:377) > at > org.apache.iceberg.mr.mapreduce.IcebergInputFormat$IcebergRecordReader.nextTask(IcebergInputFormat.java:270) > at > org.apache.iceberg.mr.mapreduce.IcebergInputFormat$IcebergRecordReader.initialize(IcebergInputFormat.java:266) > at > org.apache.iceberg.mr.mapred.AbstractMapredIcebergRecordReader.(AbstractMapredIcebergRecordReader.java:40) > at > org.apache.iceberg.mr.hive.vector.HiveIcebergVectorizedRecordReader.(HiveIcebergVectorizedRecordReader.java:41) > {noformat} -- This message was sent by Atlassian Jira (v8.20.10#820010)
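One hedged way to avoid the NPE shown in the stack trace is to treat a disabled metadata cache as a cache miss rather than a hard precondition. The sketch below is illustrative only: `FooterCacheSketch` and `footerFromCache` are hypothetical names, and a plain `Map` stands in for LLAP's real metadata cache; the actual fix in LlapIoImpl may look quite different.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: a plain Map stands in for LLAP's metadata cache.
class FooterCacheSketch {

    private final Map<String, byte[]> metadataCache; // null when hive.llap.io.memory.mode=none

    FooterCacheSketch(Map<String, byte[]> metadataCache) {
        this.metadataCache = metadataCache;
    }

    // Instead of a hard checkNotNull (the NPE in the stack trace above),
    // a disabled cache is treated as a miss, so callers simply fall back
    // to reading the Parquet footer from the file.
    Optional<byte[]> footerFromCache(String path) {
        if (metadataCache == null) {
            return Optional.empty();
        }
        return Optional.ofNullable(metadataCache.get(path));
    }
}
```

The design point is that callers already handle a cache miss by reading the footer from storage, so a disabled cache can reuse that path instead of failing.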
[jira] [Updated] (HIVE-26928) LlapIoImpl::getParquetFooterBuffersFromCache throws exception when metadata cache is disabled
[ https://issues.apache.org/jira/browse/HIVE-26928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-26928: -- Labels: pull-request-available (was: ) > LlapIoImpl::getParquetFooterBuffersFromCache throws exception when metadata > cache is disabled > - > > Key: HIVE-26928 > URL: https://issues.apache.org/jira/browse/HIVE-26928 > Project: Hive > Issue Type: Improvement > Components: Iceberg integration >Reporter: Rajesh Balamohan >Assignee: Simhadri Govindappa >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > When metadata / LLAP cache is disabled, "iceberg + parquet" throws the > following error. "{color:#5a656d}hive.llap.io.memory.mode=none"{color} > It should check for "metadatacache" correctly or fix it in LlapIoImpl. > > {noformat} > Caused by: java.lang.NullPointerException: Metadata cache must not be null > at > com.google.common.base.Preconditions.checkNotNull(Preconditions.java:897) > at > org.apache.hadoop.hive.llap.io.api.impl.LlapIoImpl.getParquetFooterBuffersFromCache(LlapIoImpl.java:467) > at > org.apache.iceberg.mr.hive.vector.HiveVectorizedReader.parquetRecordReader(HiveVectorizedReader.java:227) > at > org.apache.iceberg.mr.hive.vector.HiveVectorizedReader.reader(HiveVectorizedReader.java:162) > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:566) > at > org.apache.iceberg.common.DynMethods$UnboundMethod.invokeChecked(DynMethods.java:65) > at > org.apache.iceberg.common.DynMethods$UnboundMethod.invoke(DynMethods.java:77) > at > org.apache.iceberg.common.DynMethods$StaticMethod.invoke(DynMethods.java:196) > at > 
org.apache.iceberg.mr.mapreduce.IcebergInputFormat$IcebergRecordReader.openVectorized(IcebergInputFormat.java:331) > at > org.apache.iceberg.mr.mapreduce.IcebergInputFormat$IcebergRecordReader.open(IcebergInputFormat.java:377) > at > org.apache.iceberg.mr.mapreduce.IcebergInputFormat$IcebergRecordReader.nextTask(IcebergInputFormat.java:270) > at > org.apache.iceberg.mr.mapreduce.IcebergInputFormat$IcebergRecordReader.initialize(IcebergInputFormat.java:266) > at > org.apache.iceberg.mr.mapred.AbstractMapredIcebergRecordReader.(AbstractMapredIcebergRecordReader.java:40) > at > org.apache.iceberg.mr.hive.vector.HiveIcebergVectorizedRecordReader.(HiveIcebergVectorizedRecordReader.java:41) > {noformat} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-22173) Query with multiple lateral views hangs during compilation
[ https://issues.apache.org/jira/browse/HIVE-22173?focusedWorklogId=839732&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839732 ]

ASF GitHub Bot logged work on HIVE-22173:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 17/Jan/23 18:11
            Start Date: 17/Jan/23 18:11
    Worklog Time Spent: 10m

Work Description: sonarcloud[bot] commented on PR #3852:
URL: https://github.com/apache/hive/pull/3852#issuecomment-1385831457

   Kudos, SonarCloud Quality Gate passed! 0 Bugs, 0 Vulnerabilities, 0 Security Hotspots, 1 Code Smell; no coverage or duplication information.

Issue Time Tracking
-------------------
    Worklog Id:     (was: 839732)
    Time Spent: 1h 50m  (was: 1h 40m)

> Query with multiple lateral views hangs during compilation
> ----------------------------------------------------------
>
>                 Key: HIVE-22173
>                 URL: https://issues.apache.org/jira/browse/HIVE-22173
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>    Affects Versions: 3.1.1, 4.0.0-alpha-1
>         Environment: Hive-3.1.1, Java-8
>            Reporter: Rajkumar Singh
>            Assignee: Stamatis Zampetakis
>            Priority: Critical
>              Labels: pull-request-available
>         Attachments: op_plan_4_lateral_views.pdf, thread-progress.log
>
>          Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Steps To Repro:
> {code:java}
> -- create table
> CREATE EXTERNAL TABLE `jsontable`(
> `json_string` string)
> ROW FORMAT SERDE
> 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
> STORED AS INPUTFORMAT
> 'org.apache.hadoop.mapred.TextInputFormat'
> OUTPUTFORMAT
> 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' ;
> -- Run explain of the query
> explain SELECT
> *
> FROM jsontable
> lateral view
> explode(split(regexp_replace(get_json_object(jsontable.json_string,
> '$.jsonfield.addr.city'), "\\[|\\]|\"", ""),',')) t1 as c1
> lateral view
> explode(split(regexp_replace(get_json_object(jsontable.json_string,
> '$.jsonfield.addr.country'), "\\[|\\]|\"", ""),',')) t2 as c2
> lateral view
[jira] [Work logged] (HIVE-26809) Upgrade ORC to 1.8.1
[ https://issues.apache.org/jira/browse/HIVE-26809?focusedWorklogId=839726&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839726 ]

ASF GitHub Bot logged work on HIVE-26809:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 17/Jan/23 17:46
            Start Date: 17/Jan/23 17:46
    Worklog Time Spent: 10m

Work Description: difin commented on code in PR #3833:
URL: https://github.com/apache/hive/pull/3833#discussion_r1072298943

   ## ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/EncodedTreeReaderFactory.java:
   ## @@ -224,7 +237,252 @@ private static void skipCompressedIndex(boolean isCompressed, PositionProvider i
        index.getNext();
      }

   -  protected static class StringStreamReader extends StringTreeReader
   +  public static class StringDictionaryTreeReaderHive extends TreeReader {

   Review Comment:
   Hi @ayushtkn, I agree with you; it is not an ideal approach. Before implementing this approach I did try to adapt Hive, but I couldn't find how Hive could be adapted to the ORC-1060 changes, because those changes are inside the internal implementation of ORC's StringDictionaryTreeReader class. The API of the StringDictionaryTreeReader class remained the same. I agree that this approach will backfire in the future when we try to upgrade and the changes in ORC depend on the ones we ditched, but Hive already depends heavily on internal ORC APIs by implementing its own column readers on top of ORC, and when upgrading to a different ORC version it is often necessary to make adaptations in Hive.

Issue Time Tracking
-------------------
    Worklog Id:     (was: 839726)
    Time Spent: 5.5h  (was: 5h 20m)

> Upgrade ORC to 1.8.1
> --------------------
>
>                 Key: HIVE-26809
>                 URL: https://issues.apache.org/jira/browse/HIVE-26809
>             Project: Hive
>          Issue Type: Improvement
>    Affects Versions: 4.0.0
>            Reporter: Dmitriy Fingerman
>            Assignee: Dmitriy Fingerman
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 5.5h
>  Remaining Estimate: 0h
>

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
[jira] [Work logged] (HIVE-26809) Upgrade ORC to 1.8.1
[ https://issues.apache.org/jira/browse/HIVE-26809?focusedWorklogId=839725&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839725 ]

ASF GitHub Bot logged work on HIVE-26809:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 17/Jan/23 17:44
            Start Date: 17/Jan/23 17:44
    Worklog Time Spent: 10m

Work Description: difin commented on code in PR #3833:
URL: https://github.com/apache/hive/pull/3833#discussion_r1072298943

   ## ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/EncodedTreeReaderFactory.java:
   ## @@ -224,7 +237,252 @@ private static void skipCompressedIndex(boolean isCompressed, PositionProvider i
        index.getNext();
      }

   -  protected static class StringStreamReader extends StringTreeReader
   +  public static class StringDictionaryTreeReaderHive extends TreeReader {

   Review Comment:
   Hi @ayushtkn, I agree with you; it is not an ideal approach. Before implementing this approach I did try to adapt Hive, but I couldn't find how Hive could be adapted to the ORC-1060 changes, because those changes are inside the internal implementation of ORC's StringDictionaryTreeReader class. I agree that this approach will backfire in the future when we try to upgrade and the changes in ORC depend on the ones we ditched, but Hive already depends heavily on internal ORC APIs by implementing its own column readers on top of ORC, and when upgrading to a different ORC version it is often necessary to make adaptations in Hive.

Issue Time Tracking
-------------------
    Worklog Id:     (was: 839725)
    Time Spent: 5h 20m  (was: 5h 10m)

> Upgrade ORC to 1.8.1
> --------------------
>
>                 Key: HIVE-26809
>                 URL: https://issues.apache.org/jira/browse/HIVE-26809
>             Project: Hive
>          Issue Type: Improvement
>    Affects Versions: 4.0.0
>            Reporter: Dmitriy Fingerman
>            Assignee: Dmitriy Fingerman
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 5h 20m
>  Remaining Estimate: 0h
>

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
[jira] [Work logged] (HIVE-26947) Hive compactor.Worker can respawn connections to HMS at extremely high frequency
[ https://issues.apache.org/jira/browse/HIVE-26947?focusedWorklogId=839722&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839722 ]

ASF GitHub Bot logged work on HIVE-26947:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 17/Jan/23 17:29
            Start Date: 17/Jan/23 17:29
    Worklog Time Spent: 10m

Work Description: sonarcloud[bot] commented on PR #3955:
URL: https://github.com/apache/hive/pull/3955#issuecomment-1385777432

   Kudos, SonarCloud Quality Gate passed! 0 Bugs, 0 Vulnerabilities, 0 Security Hotspots, 7 Code Smells; no coverage or duplication information.

Issue Time Tracking
-------------------
    Worklog Id:     (was: 839722)
    Time Spent: 1h 10m  (was: 1h)

> Hive compactor.Worker can respawn connections to HMS at extremely high
> frequency
> ----------------------------------------------------------------------
>
>                 Key: HIVE-26947
>                 URL: https://issues.apache.org/jira/browse/HIVE-26947
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Akshat Mathur
>            Assignee: Akshat Mathur
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> After catching the exception generated by the findNextCompactionAndExecute()
> task, HS2 appears to immediately rerun the task with no delay or backoff. As
> a result, there are ~3500 connection attempts from HS2 to HMS over just a
> 5-second period in the HS2 log.
> The compactor.Worker should wait between failed attempts and maybe do an
> exponential backoff.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
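The wait-between-failures behaviour suggested in the issue description could look like the following sketch. `backoffMillis` and its parameters are hypothetical names chosen for illustration; the real fix would live in the compactor Worker's retry loop.

```java
// Hypothetical sketch of the exponential-backoff idea from the description.
class CompactorBackoff {

    // Exponential backoff with a cap: base, 2*base, 4*base, ... up to maxMillis.
    static long backoffMillis(int failedAttempts, long baseMillis, long maxMillis) {
        if (failedAttempts <= 0) {
            return 0L; // no failures yet: retry immediately
        }
        int shift = Math.min(failedAttempts - 1, 20); // bound the shift to avoid overflow
        return Math.min(baseMillis << shift, maxMillis);
    }
}
```

With a 1-second base and a 60-second cap, ten consecutive failures would produce at most a handful of HMS connection attempts per minute instead of thousands per few seconds.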
[jira] [Work logged] (HIVE-26717) Query based Rebalance compaction on insert-only tables
[ https://issues.apache.org/jira/browse/HIVE-26717?focusedWorklogId=839720&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839720 ]

ASF GitHub Bot logged work on HIVE-26717:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 17/Jan/23 17:24
            Start Date: 17/Jan/23 17:24
    Worklog Time Spent: 10m

Work Description: deniskuzZ commented on code in PR #3935:
URL: https://github.com/apache/hive/pull/3935#discussion_r1072518934

   ## standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionTxnHandler.java:
   ## @@ -342,15 +342,17 @@ public CompactionInfo findNextToCompact(FindNextCompactRequest rqst) throws Meta
      public void markCompacted(CompactionInfo info) throws MetaException {
        try {
          Connection dbConn = null;
   -      Statement stmt = null;
   +      PreparedStatement pstmt = null;

   Review Comment:
   Use try-with-resources since you refactored this method; it's reported in findbugs.

Issue Time Tracking
-------------------
    Worklog Id:     (was: 839720)
    Time Spent: 2h 20m  (was: 2h 10m)

> Query based Rebalance compaction on insert-only tables
> ------------------------------------------------------
>
>                 Key: HIVE-26717
>                 URL: https://issues.apache.org/jira/browse/HIVE-26717
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Hive
>            Reporter: László Végh
>            Assignee: László Végh
>            Priority: Major
>              Labels: ACID, compaction, pull-request-available
>          Time Spent: 2h 20m
>  Remaining Estimate: 0h
>

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
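The try-with-resources refactor the reviewer asks for can be illustrated without a real JDBC driver. `FakeConnection` and `FakePreparedStatement` below are stand-ins written for this sketch (the real code uses `java.sql` types); the point is only that resources declared in the try header are closed automatically, in reverse order, even when an exception is thrown, which removes the manual null checks and finally blocks that findbugs flags.

```java
// Stand-ins for JDBC types, written for this sketch only (no real driver).
class TryWithResourcesSketch {

    static final StringBuilder log = new StringBuilder();

    static class FakeConnection implements AutoCloseable {
        FakePreparedStatement prepareStatement(String sql) {
            return new FakePreparedStatement();
        }
        @Override public void close() { log.append("conn-closed;"); }
    }

    static class FakePreparedStatement implements AutoCloseable {
        void setLong(int idx, long v) { /* bind parameter */ }
        int executeUpdate() { return 1; }
        @Override public void close() { log.append("stmt-closed;"); }
    }

    // Resources declared in the try header are closed automatically, in
    // reverse declaration order (statement first, then connection), even
    // if executeUpdate throws: no manual null checks or finally blocks.
    static int markCompacted(long compactionId) {
        try (FakeConnection dbConn = new FakeConnection();
             FakePreparedStatement pstmt = dbConn.prepareStatement(
                 "UPDATE COMPACTION_QUEUE SET CQ_STATE = ? WHERE CQ_ID = ?")) {
            pstmt.setLong(2, compactionId);
            return pstmt.executeUpdate();
        }
    }
}
```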
[jira] [Work logged] (HIVE-26896) Update load_static_ptn_into_bucketed_table.q.out file
[ https://issues.apache.org/jira/browse/HIVE-26896?focusedWorklogId=839714&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839714 ]

ASF GitHub Bot logged work on HIVE-26896:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 17/Jan/23 17:05
            Start Date: 17/Jan/23 17:05
    Worklog Time Spent: 10m

Work Description: zabetak closed pull request #3901: HIVE-26896: Test fixes for lineage3.q and load_static_ptn_into_bucketed_table.q
URL: https://github.com/apache/hive/pull/3901

Issue Time Tracking
-------------------
    Worklog Id:     (was: 839714)
    Time Spent: 1h 50m  (was: 1h 40m)

> Update load_static_ptn_into_bucketed_table.q.out file
> -----------------------------------------------------
>
>                 Key: HIVE-26896
>                 URL: https://issues.apache.org/jira/browse/HIVE-26896
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Aman Raj
>            Assignee: Aman Raj
>            Priority: Critical
>              Labels: pull-request-available
>             Fix For: 3.2.0
>
>          Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> These tests were fixed in branch-3.1, so backporting them to branch-3.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
[jira] [Resolved] (HIVE-26896) Update load_static_ptn_into_bucketed_table.q.out file
[ https://issues.apache.org/jira/browse/HIVE-26896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stamatis Zampetakis resolved HIVE-26896.
----------------------------------------
       Fix Version/s: 3.2.0
          Resolution: Fixed

Fixed in https://github.com/apache/hive/commit/e5573b0e0d30f8c3042239cf0fda219b25fe075d. Thanks for the PR [~amanraj2520]!

> Update load_static_ptn_into_bucketed_table.q.out file
> -----------------------------------------------------
>
>                 Key: HIVE-26896
>                 URL: https://issues.apache.org/jira/browse/HIVE-26896
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Aman Raj
>            Assignee: Aman Raj
>            Priority: Critical
>              Labels: pull-request-available
>             Fix For: 3.2.0
>
>          Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> These tests were fixed in branch-3.1, so backporting them to branch-3.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
[jira] [Updated] (HIVE-26896) Update load_static_ptn_into_bucketed_table.q.out file
[ https://issues.apache.org/jira/browse/HIVE-26896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stamatis Zampetakis updated HIVE-26896:
---------------------------------------
    Summary: Update load_static_ptn_into_bucketed_table.q.out file  (was: Backport of Test fixes for lineage3.q and load_static_ptn_into_bucketed_table.q)

> Update load_static_ptn_into_bucketed_table.q.out file
> -----------------------------------------------------
>
>                 Key: HIVE-26896
>                 URL: https://issues.apache.org/jira/browse/HIVE-26896
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Aman Raj
>            Assignee: Aman Raj
>            Priority: Critical
>              Labels: pull-request-available
>          Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> These tests were fixed in branch-3.1, so backporting them to branch-3.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
[jira] [Work logged] (HIVE-22173) Query with multiple lateral views hangs during compilation
[ https://issues.apache.org/jira/browse/HIVE-22173?focusedWorklogId=839710=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839710 ] ASF GitHub Bot logged work on HIVE-22173: - Author: ASF GitHub Bot Created on: 17/Jan/23 16:43 Start Date: 17/Jan/23 16:43 Worklog Time Spent: 10m Work Description: zabetak commented on PR #3852: URL: https://github.com/apache/hive/pull/3852#issuecomment-1385710968 @amansinha100 I addressed your comments can you please have another look. Thanks! Issue Time Tracking --- Worklog Id: (was: 839710) Time Spent: 1h 40m (was: 1.5h) > Query with multiple lateral views hangs during compilation > -- > > Key: HIVE-22173 > URL: https://issues.apache.org/jira/browse/HIVE-22173 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 3.1.1, 4.0.0-alpha-1 > Environment: Hive-3.1.1, Java-8 >Reporter: Rajkumar Singh >Assignee: Stamatis Zampetakis >Priority: Critical > Labels: pull-request-available > Attachments: op_plan_4_lateral_views.pdf, thread-progress.log > > Time Spent: 1h 40m > Remaining Estimate: 0h > > Steps To Repro: > {code:java} > -- create table > CREATE EXTERNAL TABLE `jsontable`( > `json_string` string) > ROW FORMAT SERDE > 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' > STORED AS INPUTFORMAT > 'org.apache.hadoop.mapred.TextInputFormat' > OUTPUTFORMAT > 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' ; > -- Run explain of the query > explain SELECT > * > FROM jsontable > lateral view > explode(split(regexp_replace(get_json_object(jsontable.json_string, > '$.jsonfield.addr.city'), "\\[|\\]|\"", ""),',')) t1 as c1 > lateral view > explode(split(regexp_replace(get_json_object(jsontable.json_string, > '$.jsonfield.addr.country'), "\\[|\\]|\"", ""),',')) t2 as c2 > lateral view > explode(split(regexp_replace(get_json_object(jsontable.json_string, > '$.jsonfield.addr'), "\\[|\\]|\"", ""),',')) t3 as c3 > lateral view > 
explode(split(regexp_replace(get_json_object(jsontable.json_string, > '$.jsonfield.addr.postalCode'), "\\[|\\]|\"", ""),',')) t4 as c4 > lateral view > explode(split(regexp_replace(get_json_object(jsontable.json_string, > '$.jsonfield.addr.state'), "\\[|\\]|\"", ""),',')) t5 as c5 > lateral view > explode(split(regexp_replace(get_json_object(jsontable.json_string, > '$.jsonfield.addr.streetAddressLine'), "\\[|\\]|\"", ""),',')) t6 as c6 > lateral view > explode(split(regexp_replace(get_json_object(jsontable.json_string, > '$.jsonfield.dummyfield'), "\\[|\\]|\"", ""),',')) t7 as c7 > lateral view > explode(split(regexp_replace(get_json_object(jsontable.json_string, > '$.jsonfield.dummyfield'), "\\[|\\]|\"", ""),',')) t8 as c8 > lateral view > explode(split(regexp_replace(get_json_object(jsontable.json_string, > '$.jsonfield.dummyfield.name.suffix'), "\\[|\\]|\"", ""),',')) t9 as c9 > lateral view > explode(split(regexp_replace(get_json_object(jsontable.json_string, > '$.jsonfield.id.extension'), "\\[|\\]|\"", ""),',')) t10 as c10 > lateral view > explode(split(regexp_replace(get_json_object(jsontable.json_string, > '$.jsonfield.id'), "\\[|\\]|\"", ""),',')) t11 as c11 > lateral view > explode(split(regexp_replace(get_json_object(jsontable.json_string, > '$.jsonfield.id.root'), "\\[|\\]|\"", ""),',')) t12 as c12 > lateral view > explode(split(regexp_replace(get_json_object(jsontable.json_string, > '$.jsonfield.telecom.'), "\\[|\\]|\"", ""),',')) t13 as c13 > lateral view > explode(split(regexp_replace(get_json_object(jsontable.json_string, > '$.jsonfield.dummyfield1.use'), "\\[|\\]|\"", ""),',')) t14 as c14 > lateral view > explode(split(regexp_replace(get_json_object(jsontable.json_string, > '$.jsonfield.dummyfield1.value'), "\\[|\\]|\"", ""),',')) t15 as c15 > lateral view > explode(split(regexp_replace(get_json_object(jsontable.json_string, > '$.jsonfield1.dummyfield1.code'), "\\[|\\]|\"", ""),',')) t16 as c16 > lateral view > 
explode(split(regexp_replace(get_json_object(jsontable.json_string, > '$.jsonfield1.dummyfield1.value'), "\\[|\\]|\"", ""),',')) t17 as c17 > lateral view > explode(split(regexp_replace(get_json_object(jsontable.json_string, > '$.jsonfield2.city'), "\\[|\\]|\"", ""),',')) t18 as c18 > lateral view > explode(split(regexp_replace(get_json_object(jsontable.json_string, > '$.jsonfield2.city'), "\\[|\\]|\"", ""),',')) t19 as c19 > lateral view > explode(split(regexp_replace(get_json_object(jsontable.json_string, > '$.jsonfield2.country'), "\\[|\\]|\"", ""),',')) t20 as c20 > lateral view > explode(split(regexp_replace(get_json_object(jsontable.json_string, > '$.jsonfield2.country'), "\\[|\\]|\"",
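For intuition about why compilation hangs rather than fails: each additional lateral view adds a fork/join ("diamond") to the operator DAG, and a walker that explores every root-to-leaf path separately does exponential work, while a once-per-node walker stays linear. The following is a toy model of that blow-up (illustrative only, not Hive code; the class and method names are invented for the sketch):

```java
// Toy model (not Hive code): each lateral view contributes a fork/join
// "diamond" to the operator DAG. A walker that explores every root-to-leaf
// path separately performs O(2^n) visits for n stacked diamonds, while a
// visit-each-node-once walker performs O(n) visits.
public class LateralViewBlowup {

    // Visits made by a per-path walker: at each diamond both branches are
    // explored independently, even though they re-join one level down.
    static long perPathVisits(int diamonds) {
        if (diamonds == 0) {
            return 1; // the sink operator
        }
        return 1 + 2 * perPathVisits(diamonds - 1);
    }

    // Visits made by a once-per-node walker: a fork plus two branch nodes
    // per diamond, plus the sink.
    static long oncePerNodeVisits(int diamonds) {
        return 3L * diamonds + 1;
    }

    public static void main(String[] args) {
        // 20 stacked lateral views: about 2 million visits versus 61.
        System.out.println(perPathVisits(20));     // 2097151
        System.out.println(oncePerNodeVisits(20)); // 61
    }
}
```

With the roughly 20 lateral views in the repro above, a per-path traversal is already in the millions of visits, which matches the observed "hangs during compilation" symptom.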
[jira] [Work logged] (HIVE-22173) Query with multiple lateral views hangs during compilation
[ https://issues.apache.org/jira/browse/HIVE-22173?focusedWorklogId=839708&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839708 ] ASF GitHub Bot logged work on HIVE-22173: - Author: ASF GitHub Bot Created on: 17/Jan/23 16:42 Start Date: 17/Jan/23 16:42 Worklog Time Spent: 10m Work Description: zabetak commented on PR #3852: URL: https://github.com/apache/hive/pull/3852#issuecomment-1385709912 > Also, the commit message mentions partition pruning but I didn't see changes related to that (I might have missed it). @amansinha100 The partition pruning optimization also relies on the presence of the synthetic `IN (...)` predicates generated by the `SyntheticJoinPredicate` transformation, and thus it is also affected by the changes here. For more details: https://github.com/apache/hive/blob/ad0ab58d9945b9a4727ab606f566e1d346bbd20b/ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java#L91 Issue Time Tracking --- Worklog Id: (was: 839708) Time Spent: 1.5h (was: 1h 20m) > Query with multiple lateral views hangs during compilation > -- > > Key: HIVE-22173 > URL: https://issues.apache.org/jira/browse/HIVE-22173 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 3.1.1, 4.0.0-alpha-1 > Environment: Hive-3.1.1, Java-8 >Reporter: Rajkumar Singh >Assignee: Stamatis Zampetakis >Priority: Critical > Labels: pull-request-available > Attachments: op_plan_4_lateral_views.pdf, thread-progress.log > > Time Spent: 1.5h > Remaining Estimate: 0h
[jira] [Work logged] (HIVE-22173) Query with multiple lateral views hangs during compilation
[ https://issues.apache.org/jira/browse/HIVE-22173?focusedWorklogId=839707&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839707 ] ASF GitHub Bot logged work on HIVE-22173: - Author: ASF GitHub Bot Created on: 17/Jan/23 16:40 Start Date: 17/Jan/23 16:40 Worklog Time Spent: 10m Work Description: zabetak commented on code in PR #3852: URL: https://github.com/apache/hive/pull/3852#discussion_r1072444905 ## ql/src/test/results/clientpositive/llap/lvj_mapjoin.q.out: ## @@ -121,7 +121,6 @@ STAGE PLANS: TableScan alias: expod1 filterExpr: aid is not null (type: boolean) - probeDecodeDetails: cacheKey:HASH_MAP_MAPJOIN_39_container, bigKeyColName:aid, smallTablePos:1, keyRatio:1.0 Review Comment: The probe decode optimization relies on the presence of semijoins (https://github.com/apache/hive/blob/5f57814ed743a411c8fa7c647c24c98461271fe3/ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java#L1401). Semijoins depend on the `SyntheticJoinPredicate` transformation, so probe decode depends transitively on `SyntheticJoinPredicate`. This PR disables the `SyntheticJoinPredicate` transformation for branches with lateral views (present in the `lvj_mapjoin.q` test), so semijoins are not considered, and neither is probe decode. 
Issue Time Tracking --- Worklog Id: (was: 839707) Time Spent: 1h 20m (was: 1h 10m) > Query with multiple lateral views hangs during compilation > -- > > Key: HIVE-22173 > URL: https://issues.apache.org/jira/browse/HIVE-22173 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 3.1.1, 4.0.0-alpha-1 > Environment: Hive-3.1.1, Java-8 >Reporter: Rajkumar Singh >Assignee: Stamatis Zampetakis >Priority: Critical > Labels: pull-request-available > Attachments: op_plan_4_lateral_views.pdf, thread-progress.log > > Time Spent: 1h 20m > Remaining Estimate: 0h
[jira] [Work logged] (HIVE-22173) Query with multiple lateral views hangs during compilation
[ https://issues.apache.org/jira/browse/HIVE-22173?focusedWorklogId=839705&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839705 ] ASF GitHub Bot logged work on HIVE-22173: - Author: ASF GitHub Bot Created on: 17/Jan/23 16:39 Start Date: 17/Jan/23 16:39 Worklog Time Spent: 10m Work Description: zabetak commented on code in PR #3852: URL: https://github.com/apache/hive/pull/3852#discussion_r1072443296 ## common/src/java/org/apache/hadoop/hive/conf/HiveConf.java: ## @@ -3710,7 +3710,12 @@ public static enum ConfVars { HIVE_EXPLAIN_USER("hive.explain.user", true, "Whether to show explain result at user level.\n" + "When enabled, will log EXPLAIN output for the query at user level. Tez only."), - +HIVE_EXPLAIN_VISIT_LIMIT("hive.explain.visit.limit", 256, new RangeValidator(1, Integer.MAX_VALUE), Review Comment: The limit applies only when doing EXPLAIN, hence the choice of this name. Adding `node` in the property name is a good idea so I applied this change (https://github.com/apache/hive/pull/3852/commits/5c9933e1a59fe6b83638cfa62f9ce887c711). I opted to introduce a limit because it is not possible to address the problem at the EXPLAIN level without changing the output format. There are many places where a graph is traversed in Hive, and applying a global limit everywhere would be difficult to enforce. Moreover, it would possibly require changes in many places, leading to a change with much bigger impact. If we want to go for a global visit limit then maybe it would be better to do it as a separate JIRA/PR. 
Issue Time Tracking --- Worklog Id: (was: 839705) Time Spent: 1h 10m (was: 1h) > Query with multiple lateral views hangs during compilation > -- > > Key: HIVE-22173 > URL: https://issues.apache.org/jira/browse/HIVE-22173 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 3.1.1, 4.0.0-alpha-1 > Environment: Hive-3.1.1, Java-8 >Reporter: Rajkumar Singh >Assignee: Stamatis Zampetakis >Priority: Critical > Labels: pull-request-available > Attachments: op_plan_4_lateral_views.pdf, thread-progress.log > > Time Spent: 1h 10m > Remaining Estimate: 0h
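The idea behind `hive.explain.visit.limit` discussed in the review above can be sketched as a traversal budget. The class and field names below are assumptions for illustration, not the actual Hive implementation: once the walker has spent its configured number of node visits, it bails out instead of letting a pathological plan walk run unbounded.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of a node-visit budget in the spirit of
// hive.explain.visit.limit (names and structure are assumptions, not the
// actual Hive walker).
public class VisitLimitWalker {
    static class Node {
        final List<Node> children = new ArrayList<>();
    }

    private final int limit; // e.g. the 256 default mentioned in the diff
    int visited = 0;

    VisitLimitWalker(int limit) { this.limit = limit; }

    /** Returns false when the walk was cut short by the budget. */
    boolean walk(Node n) {
        if (visited >= limit) {
            return false; // budget exhausted: give up instead of hanging
        }
        visited++;
        for (Node c : n.children) {
            if (!walk(c)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // A 300-node chain exceeds a budget of 256, so the walk is truncated.
        Node root = new Node();
        Node cur = root;
        for (int i = 1; i < 300; i++) {
            Node next = new Node();
            cur.children.add(next);
            cur = next;
        }
        VisitLimitWalker w = new VisitLimitWalker(256);
        System.out.println(w.walk(root) + " visited=" + w.visited);
    }
}
```

This mirrors the trade-off zabetak describes: a local budget at the EXPLAIN traversal is simple to enforce, whereas a global limit across every graph walk in the compiler would touch many call sites.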
[jira] [Work logged] (HIVE-22173) Query with multiple lateral views hangs during compilation
[ https://issues.apache.org/jira/browse/HIVE-22173?focusedWorklogId=839704&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839704 ] ASF GitHub Bot logged work on HIVE-22173: - Author: ASF GitHub Bot Created on: 17/Jan/23 16:37 Start Date: 17/Jan/23 16:37 Worklog Time Spent: 10m Work Description: zabetak commented on code in PR #3852: URL: https://github.com/apache/hive/pull/3852#discussion_r1072441255 ## ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java: ## @@ -94,8 +95,8 @@ public ParseContext transform(ParseContext pctx) throws SemanticException { // rule and passes the context along SyntheticContext context = new SyntheticContext(pctx); SemanticDispatcher disp = new DefaultRuleDispatcher(null, opRules, context); -SemanticGraphWalker ogw = new PreOrderOnceWalker(disp); - +PreOrderOnceWalker ogw = new PreOrderOnceWalker(disp); +ogw.excludeNode(LateralViewForwardOperator.class); Review Comment: Done (https://github.com/apache/hive/pull/3852/commits/e3c882083d7449efdc7c86ddfbf0c5e86e8c8d93) Issue Time Tracking --- Worklog Id: (was: 839704) Time Spent: 1h (was: 50m) > Query with multiple lateral views hangs during compilation > -- > > Key: HIVE-22173 > URL: https://issues.apache.org/jira/browse/HIVE-22173 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 3.1.1, 4.0.0-alpha-1 > Environment: Hive-3.1.1, Java-8 >Reporter: Rajkumar Singh >Assignee: Stamatis Zampetakis >Priority: Critical > Labels: pull-request-available > Attachments: op_plan_4_lateral_views.pdf, thread-progress.log > > Time Spent: 1h > Remaining Estimate: 0h
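The `ogw.excludeNode(LateralViewForwardOperator.class)` call in the diff above keeps the walker from descending past lateral-view branches, so `SyntheticJoinPredicate` never fires below them. A simplified stand-alone sketch of that exclusion mechanism (the real `PreOrderOnceWalker` lives in `org.apache.hadoop.hive.ql.lib`; everything below is illustrative, not Hive code):

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Simplified sketch of an excludeNode-style walker: nodes whose class is on
// the exclusion list are neither visited nor descended into. All class names
// here are invented for the illustration.
public class ExcludingWalker {
    interface Operator { List<Operator> children(); }

    static class Op implements Operator {
        final List<Operator> children = new ArrayList<>();
        public List<Operator> children() { return children; }
    }

    // Stand-in for Hive's LateralViewForwardOperator.
    static class LateralViewForwardOp extends Op { }

    private final Set<Class<?>> excluded = new HashSet<>();
    final Set<Operator> visited = new LinkedHashSet<>();

    void excludeNode(Class<?> clazz) { excluded.add(clazz); }

    void walk(Operator op) {
        if (excluded.contains(op.getClass()) || !visited.add(op)) {
            return; // skip excluded subtrees and already-visited nodes
        }
        for (Operator child : op.children()) {
            walk(child);
        }
    }

    public static void main(String[] args) {
        Op scan = new Op();
        Op lvf = new LateralViewForwardOp();
        Op below = new Op();
        scan.children.add(lvf);
        lvf.children.add(below);

        ExcludingWalker w = new ExcludingWalker();
        w.excludeNode(LateralViewForwardOp.class);
        w.walk(scan);
        System.out.println(w.visited.size()); // 1: only the scan is visited
    }
}
```

The `visited.add` check also gives the "once" behavior of `PreOrderOnceWalker`: each node is dispatched at most one time regardless of how many paths reach it.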
[jira] [Work logged] (HIVE-26925) MV with iceberg storage format fails when contains 'PARTITIONED ON' clause due to column number/types difference.
[ https://issues.apache.org/jira/browse/HIVE-26925?focusedWorklogId=839703&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839703 ] ASF GitHub Bot logged work on HIVE-26925: - Author: ASF GitHub Bot Created on: 17/Jan/23 16:37 Start Date: 17/Jan/23 16:37 Worklog Time Spent: 10m Work Description: kasakrisz commented on code in PR #3939: URL: https://github.com/apache/hive/pull/3939#discussion_r1072440879 ## iceberg/iceberg-handler/src/test/queries/positive/mv_iceberg_partitioned_orc.q: ## @@ -0,0 +1,16 @@ + Issue Time Tracking --- Worklog Id: (was: 839703) Time Spent: 1.5h (was: 1h 20m) > MV with iceberg storage format fails when contains 'PARTITIONED ON' clause > due to column number/types difference. > - > > Key: HIVE-26925 > URL: https://issues.apache.org/jira/browse/HIVE-26925 > Project: Hive > Issue Type: Bug > Components: Iceberg integration >Reporter: Dharmik Thakkar >Assignee: Krisztian Kasa >Priority: Critical > Labels: pull-request-available > Time Spent: 1.5h > Remaining Estimate: 0h > > MV with iceberg storage format fails when contains 'PARTITIONED ON' clause > due to column number/types difference. > {code:java} > !!! annotations iceberg > >>> use iceberg_test_db_hive; > No rows affected > >>> set hive.exec.max.dynamic.partitions=2000; > >>> set hive.exec.max.dynamic.partitions.pernode=2000; > >>> drop materialized view if exists mv_agg_gby_col_partitioned; > >>> create materialized view mv_agg_gby_col_partitioned PARTITIONED ON (t) > >>> stored by iceberg stored as orc tblproperties ('format-version'='1') as > >>> select b,f,sum(b), sum(f),t from all100k group by b,f,v,c,t; > >>> analyze table mv_agg_gby_col_partitioned compute statistics for columns; > >>> set hive.explain.user=false; > >>> explain select b,f,sum(b) from all100k where t=93 group by c,v,f,b; > !!! 
match row_contains > alias: iceberg_test_db_hive.mv_agg_gby_col_partitioned > >>> drop materialized view mv_agg_gby_col_partitioned; > {code} > Error > {code:java} > 2023-01-10T20:31:17,514 INFO [pool-5-thread-1] jdbc.TestDriver: Query: > create materialized view mv_agg_gby_col_partitioned PARTITIONED ON (t) stored > by iceberg stored as orc tblproperties ('format-version'='1') as select > b,f,sum(b), sum(f),t from all100k group by b,f,v,c,t > 2023-01-10T20:31:18,099 INFO [Thread-21] jdbc.TestDriver: INFO : Compiling > command(queryId=hive_20230110203117_6c333b6a-1642-40e7-80bc-e78dede47980): > create materialized view mv_agg_gby_col_partitioned PARTITIONED ON (t) stored > by iceberg stored as orc tblproperties ('format-version'='1') as select > b,f,sum(b), sum(f),t from all100k group by b,f,v,c,t > 2023-01-10T20:31:18,100 INFO [Thread-21] jdbc.TestDriver: INFO : No Stats > for iceberg_test_db_hive@all100k, Columns: b, c, t, f, v > 2023-01-10T20:31:18,100 INFO [Thread-21] jdbc.TestDriver: ERROR : FAILED: > SemanticException Line 0:-1 Cannot insert into target table because column > number/types are different 'TOK_TMP_FILE': Table insclause-0 has 6 columns, > but query has 5 columns. > 2023-01-10T20:31:18,100 INFO [Thread-21] jdbc.TestDriver: > org.apache.hadoop.hive.ql.parse.SemanticException: Line 0:-1 Cannot insert > into target table because column number/types are different 'TOK_TMP_FILE': > Table insclause-0 has 6 columns, but query has 5 columns. 
> 2023-01-10T20:31:18,100 INFO [Thread-21] jdbc.TestDriver: at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genConversionSelectOperator(SemanticAnalyzer.java:8905) > 2023-01-10T20:31:18,100 INFO [Thread-21] jdbc.TestDriver: at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFileSinkPlan(SemanticAnalyzer.java:8114) > 2023-01-10T20:31:18,100 INFO [Thread-21] jdbc.TestDriver: at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:11583) > 2023-01-10T20:31:18,100 INFO [Thread-21] jdbc.TestDriver: at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:11455) > 2023-01-10T20:31:18,100 INFO [Thread-21] jdbc.TestDriver: at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:12424) > 2023-01-10T20:31:18,100 INFO [Thread-21] jdbc.TestDriver: at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:12290) > 2023-01-10T20:31:18,100 INFO [Thread-21] jdbc.TestDriver: at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:13038) > 2023-01-10T20:31:18,100 INFO [Thread-21] jdbc.TestDriver: at >
[jira] [Work logged] (HIVE-26956) Improve find_in_set function
[ https://issues.apache.org/jira/browse/HIVE-26956?focusedWorklogId=839691&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839691 ] ASF GitHub Bot logged work on HIVE-26956: - Author: ASF GitHub Bot Created on: 17/Jan/23 16:00 Start Date: 17/Jan/23 16:00 Worklog Time Spent: 10m Work Description: sonarcloud[bot] commented on PR #3961: URL: https://github.com/apache/hive/pull/3961#issuecomment-1385647100 Kudos, SonarCloud Quality Gate passed! 0 Bugs, 0 Vulnerabilities, 0 Security Hotspots, 0 Code Smells; no coverage or duplication information. Issue Time Tracking --- Worklog Id: (was: 839691) Time Spent: 20m (was: 10m) > Improve find_in_set function > --- > > Key: HIVE-26956 > URL: https://issues.apache.org/jira/browse/HIVE-26956 > Project: Hive > Issue Type: Improvement >Reporter: Bingye Chen >Assignee: Bingye Chen >Priority: Minor > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > Improve find_in_set function -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-26925) MV with iceberg storage format fails when contains 'PARTITIONED ON' clause due to column number/types difference.
[ https://issues.apache.org/jira/browse/HIVE-26925?focusedWorklogId=839679=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839679 ] ASF GitHub Bot logged work on HIVE-26925: - Author: ASF GitHub Bot Created on: 17/Jan/23 15:16 Start Date: 17/Jan/23 15:16 Worklog Time Spent: 10m Work Description: amansinha100 commented on code in PR #3939: URL: https://github.com/apache/hive/pull/3939#discussion_r1072338247 ## iceberg/iceberg-handler/src/test/queries/positive/mv_iceberg_partitioned_orc.q: ## @@ -0,0 +1,16 @@ + Issue Time Tracking --- Worklog Id: (was: 839679) Time Spent: 1h 20m (was: 1h 10m) > MV with iceberg storage format fails when contains 'PARTITIONED ON' clause > due to column number/types difference. > - > > Key: HIVE-26925 > URL: https://issues.apache.org/jira/browse/HIVE-26925 > Project: Hive > Issue Type: Bug > Components: Iceberg integration >Reporter: Dharmik Thakkar >Assignee: Krisztian Kasa >Priority: Critical > Labels: pull-request-available > Time Spent: 1h 20m > Remaining Estimate: 0h > > MV with iceberg storage format fails when contains 'PARTITIONED ON' clause > due to column number/types difference. > {code:java} > !!! annotations iceberg > >>> use iceberg_test_db_hive; > No rows affected > >>> set hive.exec.max.dynamic.partitions=2000; > >>> set hive.exec.max.dynamic.partitions.pernode=2000; > >>> drop materialized view if exists mv_agg_gby_col_partitioned; > >>> create materialized view mv_agg_gby_col_partitioned PARTITIONED ON (t) > >>> stored by iceberg stored as orc tblproperties ('format-version'='1') as > >>> select b,f,sum(b), sum(f),t from all100k group by b,f,v,c,t; > >>> analyze table mv_agg_gby_col_partitioned compute statistics for columns; > >>> set hive.explain.user=false; > >>> explain select b,f,sum(b) from all100k where t=93 group by c,v,f,b; > !!! 
match row_contains > alias: iceberg_test_db_hive.mv_agg_gby_col_partitioned > >>> drop materialized view mv_agg_gby_col_partitioned; > {code} > Error > {code:java} > 2023-01-10T20:31:17,514 INFO [pool-5-thread-1] jdbc.TestDriver: Query: > create materialized view mv_agg_gby_col_partitioned PARTITIONED ON (t) stored > by iceberg stored as orc tblproperties ('format-version'='1') as select > b,f,sum(b), sum(f),t from all100k group by b,f,v,c,t > 2023-01-10T20:31:18,099 INFO [Thread-21] jdbc.TestDriver: INFO : Compiling > command(queryId=hive_20230110203117_6c333b6a-1642-40e7-80bc-e78dede47980): > create materialized view mv_agg_gby_col_partitioned PARTITIONED ON (t) stored > by iceberg stored as orc tblproperties ('format-version'='1') as select > b,f,sum(b), sum(f),t from all100k group by b,f,v,c,t > 2023-01-10T20:31:18,100 INFO [Thread-21] jdbc.TestDriver: INFO : No Stats > for iceberg_test_db_hive@all100k, Columns: b, c, t, f, v > 2023-01-10T20:31:18,100 INFO [Thread-21] jdbc.TestDriver: ERROR : FAILED: > SemanticException Line 0:-1 Cannot insert into target table because column > number/types are different 'TOK_TMP_FILE': Table insclause-0 has 6 columns, > but query has 5 columns. > 2023-01-10T20:31:18,100 INFO [Thread-21] jdbc.TestDriver: > org.apache.hadoop.hive.ql.parse.SemanticException: Line 0:-1 Cannot insert > into target table because column number/types are different 'TOK_TMP_FILE': > Table insclause-0 has 6 columns, but query has 5 columns. 
> 2023-01-10T20:31:18,100 INFO [Thread-21] jdbc.TestDriver: at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genConversionSelectOperator(SemanticAnalyzer.java:8905) > 2023-01-10T20:31:18,100 INFO [Thread-21] jdbc.TestDriver: at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFileSinkPlan(SemanticAnalyzer.java:8114) > 2023-01-10T20:31:18,100 INFO [Thread-21] jdbc.TestDriver: at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:11583) > 2023-01-10T20:31:18,100 INFO [Thread-21] jdbc.TestDriver: at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:11455) > 2023-01-10T20:31:18,100 INFO [Thread-21] jdbc.TestDriver: at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:12424) > 2023-01-10T20:31:18,100 INFO [Thread-21] jdbc.TestDriver: at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:12290) > 2023-01-10T20:31:18,100 INFO [Thread-21] jdbc.TestDriver: at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:13038) > 2023-01-10T20:31:18,100 INFO [Thread-21] jdbc.TestDriver: at >
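The SemanticException above is Hive's insert-time column check firing: the materialized view's target table ends up with six columns while the defining query produces only five. A toy illustration of that count check (column names are hypothetical and this is not Hive's actual genConversionSelectOperator code):

```python
def check_insert_columns(target_cols, query_cols):
    # Illustrative stand-in for the validation that raises the
    # "Cannot insert into target table" SemanticException quoted above.
    if len(target_cols) != len(query_cols):
        raise ValueError(
            "Cannot insert into target table because column number/types are "
            "different: table has %d columns, but query has %d columns"
            % (len(target_cols), len(query_cols)))

# The MV query above produces 5 expressions (b, f, sum(b), sum(f), t), while
# the partitioned Iceberg target was created with 6 columns -- hence the error
# "Table insclause-0 has 6 columns, but query has 5 columns".
mv_query_cols = ["b", "f", "sum(b)", "sum(f)", "t"]
target_cols = ["b", "f", "_c2", "_c3", "_c4", "t"]  # hypothetical 6-col target
```

Running the check with these two lists reproduces the shape of the failure: the counts differ, so the statement is rejected at compile time.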
[jira] [Work logged] (HIVE-26809) Upgrade ORC to 1.8.1
[ https://issues.apache.org/jira/browse/HIVE-26809?focusedWorklogId=839669&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839669 ] ASF GitHub Bot logged work on HIVE-26809: - Author: ASF GitHub Bot Created on: 17/Jan/23 14:45 Start Date: 17/Jan/23 14:45 Worklog Time Spent: 10m Work Description: difin commented on code in PR #3833: URL: https://github.com/apache/hive/pull/3833#discussion_r1072298943 ## ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/EncodedTreeReaderFactory.java: ## @@ -224,7 +237,252 @@ private static void skipCompressedIndex(boolean isCompressed, PositionProvider i index.getNext(); } - protected static class StringStreamReader extends StringTreeReader + public static class StringDictionaryTreeReaderHive extends TreeReader { Review Comment: Hi @ayushtkn, I agree with you that this is not an ideal approach. Before implementing it I tried to adapt Hive instead, but I could not find a way to adapt Hive to the ORC-1060 changes, because those changes live entirely inside the internal implementation of ORC's StringDictionaryTreeReader class. I also agree that this approach may backfire in the future if an upgrade brings ORC changes that depend on the ones we ditched. However, Hive already depends heavily on internal ORC APIs by implementing its own column readers on top of ORC, and upgrading to a different ORC version often requires adaptations in Hive anyway. Issue Time Tracking --- Worklog Id: (was: 839669) Time Spent: 5h 10m (was: 5h) > Upgrade ORC to 1.8.1 > > > Key: HIVE-26809 > URL: https://issues.apache.org/jira/browse/HIVE-26809 > Project: Hive > Issue Type: Improvement >Affects Versions: 4.0.0 >Reporter: Dmitriy Fingerman >Assignee: Dmitriy Fingerman >Priority: Major > Labels: pull-request-available > Time Spent: 5h 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-26717) Query based Rebalance compaction on insert-only tables
[ https://issues.apache.org/jira/browse/HIVE-26717?focusedWorklogId=839663&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839663 ] ASF GitHub Bot logged work on HIVE-26717: - Author: ASF GitHub Bot Created on: 17/Jan/23 14:28 Start Date: 17/Jan/23 14:28 Worklog Time Spent: 10m Work Description: sonarcloud[bot] commented on PR #3935: URL: https://github.com/apache/hive/pull/3935#issuecomment-1385511656 Kudos, SonarCloud Quality Gate passed! 1 Bug (rating E), 0 Vulnerabilities (rating A), 0 Security Hotspots (rating A), 5 Code Smells (rating A). No Coverage information. No Duplication information. Issue Time Tracking --- Worklog Id: (was: 839663) Time Spent: 2h 10m (was: 2h) > Query based Rebalance compaction on insert-only tables > -- > > Key: HIVE-26717 > URL: https://issues.apache.org/jira/browse/HIVE-26717 > Project: Hive > Issue Type: Sub-task > Components: Hive >Reporter: László Végh >Assignee: László Végh >Priority: Major > Labels: ACID, compaction, pull-request-available > Time Spent: 2h 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (HIVE-26957) Add convertCharset(s, from, to) function
[ https://issues.apache.org/jira/browse/HIVE-26957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bingye Chen reassigned HIVE-26957: -- > Add convertCharset(s, from, to) function > > > Key: HIVE-26957 > URL: https://issues.apache.org/jira/browse/HIVE-26957 > Project: Hive > Issue Type: New Feature >Reporter: Bingye Chen >Assignee: Bingye Chen >Priority: Minor > > Add convertCharset(s, from, to) function. > The function converts the string `s` from the `from` charset to the `to` > charset. It is already implemented in ClickHouse. -- This message was sent by Atlassian Jira (v8.20.10#820010)
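Behaviorally, convertCharset(s, from, to) re-encodes a value's bytes from one character set to another. A rough Python model of the proposed semantics (a sketch of the idea, not Hive's or ClickHouse's implementation):

```python
def convert_charset(data: bytes, from_charset: str, to_charset: str) -> bytes:
    # Decode the input bytes using the source charset, then encode the
    # resulting text using the target charset.
    return data.decode(from_charset).encode(to_charset)
```

For example, `convert_charset(b"\xe9", "latin-1", "utf-8")` re-encodes the Latin-1 byte for "é" as the two-byte UTF-8 sequence `b"\xc3\xa9"`.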
[jira] [Work logged] (HIVE-26956) Improv find_in_set function
[ https://issues.apache.org/jira/browse/HIVE-26956?focusedWorklogId=839630&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839630 ] ASF GitHub Bot logged work on HIVE-26956: - Author: ASF GitHub Bot Created on: 17/Jan/23 13:30 Start Date: 17/Jan/23 13:30 Worklog Time Spent: 10m Work Description: TaoZex opened a new pull request, #3961: URL: https://github.com/apache/hive/pull/3961 ### What changes were proposed in this pull request? Improv find_in_set function ### Why are the changes needed? Code redundancy ### Does this PR introduce _any_ user-facing change? no ### How was this patch tested? Issue Time Tracking --- Worklog Id: (was: 839630) Remaining Estimate: 0h Time Spent: 10m > Improv find_in_set function > --- > > Key: HIVE-26956 > URL: https://issues.apache.org/jira/browse/HIVE-26956 > Project: Hive > Issue Type: Improvement >Reporter: Bingye Chen >Assignee: Bingye Chen >Priority: Minor > Time Spent: 10m > Remaining Estimate: 0h > > Improv find_in_set function -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-26933) Cleanup dump directory for eventId which was failed in previous dump cycle
[ https://issues.apache.org/jira/browse/HIVE-26933?focusedWorklogId=839631&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839631 ] ASF GitHub Bot logged work on HIVE-26933: - Author: ASF GitHub Bot Created on: 17/Jan/23 13:30 Start Date: 17/Jan/23 13:30 Worklog Time Spent: 10m Work Description: sonarcloud[bot] commented on PR #3960: URL: https://github.com/apache/hive/pull/3960#issuecomment-1385428005 Kudos, SonarCloud Quality Gate passed! 0 Bugs (rating A), 0 Vulnerabilities (rating A), 0 Security Hotspots (rating A), 0 Code Smells (rating A). No Coverage information. No Duplication information. Issue Time Tracking --- Worklog Id: (was: 839631) Time Spent: 20m (was: 10m) > Cleanup dump directory for eventId which was failed in previous dump cycle > -- > > Key: HIVE-26933 > URL: https://issues.apache.org/jira/browse/HIVE-26933 > Project: Hive > Issue Type: Improvement >Reporter: Harshal Patel >Assignee: Harshal Patel >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > # If the incremental dump operation fails while dumping an event id, the dump directory for that event id (along with the file _dumpmetadata) is left behind in the staging location, and the failed event id is recorded in the _events_dump file. > # When the user triggers a dump operation for this policy again, it resumes from the failed event id and tries to dump it again; but because the directory for that event id was already created in the previous cycle, it fails with the exception > {noformat} > [Scheduled Query Executor(schedule:repl_policytest7, execution_id:7181)]: > FAILED: Execution Error, return code 4 from > org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask. > org.apache.hadoop.fs.FileAlreadyExistsException: > /warehouse/tablespace/staging/policytest7/dGVzdDc=/14bcf976-662b-4237-b5bb-e7d63a1d089f/hive/137961/_dumpmetadata > for client 172.27.182.5 already exists > at >
[jira] [Updated] (HIVE-26956) Improv find_in_set function
[ https://issues.apache.org/jira/browse/HIVE-26956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-26956: -- Labels: pull-request-available (was: ) > Improv find_in_set function > --- > > Key: HIVE-26956 > URL: https://issues.apache.org/jira/browse/HIVE-26956 > Project: Hive > Issue Type: Improvement >Reporter: Bingye Chen >Assignee: Bingye Chen >Priority: Minor > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Improv find_in_set function -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (HIVE-26956) Improv find_in_set function
[ https://issues.apache.org/jira/browse/HIVE-26956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bingye Chen reassigned HIVE-26956: -- > Improv find_in_set function > --- > > Key: HIVE-26956 > URL: https://issues.apache.org/jira/browse/HIVE-26956 > Project: Hive > Issue Type: Improvement >Reporter: Bingye Chen >Assignee: Bingye Chen >Priority: Minor > > Improv find_in_set function -- This message was sent by Atlassian Jira (v8.20.10#820010)
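For context, FIND_IN_SET(str, strList) returns the 1-based position of str within the comma-delimited strList, 0 when it is absent (or when str itself contains a comma), and NULL when either argument is NULL. A small Python model of those semantics (a sketch of the documented behavior, not the UDF being refactored):

```python
def find_in_set(needle, haystack):
    # Model of FIND_IN_SET: 1-based index of `needle` in the comma-separated
    # `haystack`; 0 if absent or if `needle` contains a comma; None for NULL.
    if needle is None or haystack is None:
        return None
    if "," in needle:
        return 0
    parts = haystack.split(",")
    return parts.index(needle) + 1 if needle in parts else 0
```

For example, `find_in_set("ab", "abc,b,ab,c,def")` returns 3, since "ab" is the third element of the list.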
[jira] [Work logged] (HIVE-26802) Create qtest running QB compaction queries
[ https://issues.apache.org/jira/browse/HIVE-26802?focusedWorklogId=839596&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839596 ] ASF GitHub Bot logged work on HIVE-26802: - Author: ASF GitHub Bot Created on: 17/Jan/23 12:12 Start Date: 17/Jan/23 12:12 Worklog Time Spent: 10m Work Description: sonarcloud[bot] commented on PR #3882: URL: https://github.com/apache/hive/pull/3882#issuecomment-1385333728 Kudos, SonarCloud Quality Gate passed! 0 Bugs (rating A), 0 Vulnerabilities (rating A), 0 Security Hotspots (rating A), 4 Code Smells (rating A). No Coverage information. No Duplication information. Issue Time Tracking --- Worklog Id: (was: 839596) Time Spent: 4h 50m (was: 4h 40m) > Create qtest running QB compaction queries > -- > > Key: HIVE-26802 > URL: https://issues.apache.org/jira/browse/HIVE-26802 > Project: Hive > Issue Type: Improvement >Reporter: Zoltán Rátkai >Assignee: Zoltán Rátkai >Priority: Minor > Labels: pull-request-available > Time Spent: 4h 50m > Remaining Estimate: 0h > > Create a qtest that runs the queries that query-based compaction runs. > Not so much to check for correct data but more to check the query plans, to > simplify tracing changes in compilation that might affect QB compaction. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-26804) Cancel Compactions in initiated state
[ https://issues.apache.org/jira/browse/HIVE-26804?focusedWorklogId=839589=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839589 ] ASF GitHub Bot logged work on HIVE-26804: - Author: ASF GitHub Bot Created on: 17/Jan/23 11:25 Start Date: 17/Jan/23 11:25 Worklog Time Spent: 10m Work Description: veghlaci05 commented on code in PR #3880: URL: https://github.com/apache/hive/pull/3880#discussion_r1072059615 ## standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java: ## @@ -6242,4 +6245,91 @@ public boolean isWrapperFor(Class iface) throws SQLException { } } + @Override + @RetrySemantics.SafeToRetry + public AbortCompactResponse abortCompactions(AbortCompactionRequest reqst) throws MetaException, NoSuchCompactionException { + +AbortCompactResponse response = new AbortCompactResponse(new HashMap<>()); +response.setAbortedcompacts(abortCompactionResponseElements); +List compactionIdsToAbort = reqst.getCompactionIds(); +if (compactionIdsToAbort.isEmpty()) { + LOG.info("Compaction ids are missing in request. No compactions to abort"); + throw new NoSuchCompactionException("Compaction ids missing in request. 
No compactions to abort"); +} +reqst.getCompactionIds().forEach(x -> { + abortCompactionResponseElements.put(x, new AbortCompactionResponseElement(x, "Error", "Not Eligible")); +}); +List eligibleCompactionsToAbort = findEligibleCompactionsToAbort(compactionIdsToAbort); +for (int x = 0; x < eligibleCompactionsToAbort.size(); x++) { + abortCompaction(eligibleCompactionsToAbort.get(x)); +} +return response; + } + + private void addAbortCompactionResponse(long id, String message, String status) { +abortCompactionResponseElements.put(id, new AbortCompactionResponseElement(id, status, message)); + } + + @RetrySemantics.SafeToRetry + public void abortCompaction(CompactionInfo compactionInfo) throws MetaException { +try { + try (Connection dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED, connPoolMutex); + PreparedStatement pStmt = dbConn.prepareStatement(TxnQueries.INSERT_INTO_COMPLETED_COMPACTION)) { +CompactionInfo.insertIntoCompletedCompactions(pStmt, compactionInfo, getDbTime(dbConn)); +int updCount = pStmt.executeUpdate(); +if (updCount != 1) { + LOG.error("Unable to update compaction record: {}. updCnt={}", compactionInfo, updCount); + dbConn.rollback(); Review Comment: addAbortCompactionResponse() should be called here as well, stating that the compaction request could not be idnetified. ## standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java: ## @@ -6242,4 +6245,91 @@ public boolean isWrapperFor(Class iface) throws SQLException { } } + @Override + @RetrySemantics.SafeToRetry + public AbortCompactResponse abortCompactions(AbortCompactionRequest reqst) throws MetaException, NoSuchCompactionException { + +AbortCompactResponse response = new AbortCompactResponse(new HashMap<>()); +response.setAbortedcompacts(abortCompactionResponseElements); +List compactionIdsToAbort = reqst.getCompactionIds(); +if (compactionIdsToAbort.isEmpty()) { + LOG.info("Compaction ids are missing in request. 
No compactions to abort"); + throw new NoSuchCompactionException("Compaction ids missing in request. No compactions to abort"); +} +reqst.getCompactionIds().forEach(x -> { + abortCompactionResponseElements.put(x, new AbortCompactionResponseElement(x, "Error", "Not Eligible")); +}); +List eligibleCompactionsToAbort = findEligibleCompactionsToAbort(compactionIdsToAbort); +for (int x = 0; x < eligibleCompactionsToAbort.size(); x++) { + abortCompaction(eligibleCompactionsToAbort.get(x)); +} +return response; + } + + private void addAbortCompactionResponse(long id, String message, String status) { +abortCompactionResponseElements.put(id, new AbortCompactionResponseElement(id, status, message)); + } + + @RetrySemantics.SafeToRetry + public void abortCompaction(CompactionInfo compactionInfo) throws MetaException { +try { + try (Connection dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED, connPoolMutex); + PreparedStatement pStmt = dbConn.prepareStatement(TxnQueries.INSERT_INTO_COMPLETED_COMPACTION)) { +CompactionInfo.insertIntoCompletedCompactions(pStmt, compactionInfo, getDbTime(dbConn)); +int updCount = pStmt.executeUpdate(); +if (updCount != 1) { + LOG.error("Unable to update compaction record: {}. updCnt={}", compactionInfo, updCount); + dbConn.rollback(); +} else { + LOG.debug("Inserted {} entries into
[jira] [Work logged] (HIVE-26952) set the value of metastore.storage.schema.reader.impl to org.apache.hadoop.hive.metastore.SerDeStorageSchemaReader as default
[ https://issues.apache.org/jira/browse/HIVE-26952?focusedWorklogId=839581&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839581 ] ASF GitHub Bot logged work on HIVE-26952: - Author: ASF GitHub Bot Created on: 17/Jan/23 11:07 Start Date: 17/Jan/23 11:07 Worklog Time Spent: 10m Work Description: sonarcloud[bot] commented on PR #3959: URL: https://github.com/apache/hive/pull/3959#issuecomment-1385259486 Kudos, SonarCloud Quality Gate passed! 0 Bugs (rating A), 0 Vulnerabilities (rating A), 0 Security Hotspots (rating A), 0 Code Smells (rating A). No Coverage information. No Duplication information. Issue Time Tracking --- Worklog Id: (was: 839581) Time Spent: 20m (was: 10m) > set the value of metastore.storage.schema.reader.impl to > org.apache.hadoop.hive.metastore.SerDeStorageSchemaReader as default > -- > > Key: HIVE-26952 > URL: https://issues.apache.org/jira/browse/HIVE-26952 > Project: Hive > Issue Type: Improvement > Components: Standalone Metastore >Reporter: Taraka Rama Rao Lethavadla >Assignee: Taraka Rama Rao Lethavadla >Priority: Minor > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > With the default value of > {code:java} > DefaultStorageSchemaReader.class.getName(){code} > in the Metastore Config, *metastore.storage.schema.reader.impl* > below exception is thrown when trying to read
Avro schema > {noformat} > Caused by: org.apache.hive.service.cli.HiveSQLException: MetaException > (message:java.lang.UnsupportedOperationException: Storage schema reading not > supported) > at > org.apache.hive.service.cli.operation.GetColumnsOperation.runInternal(GetColumnsOperation.java:213) > at org.apache.hive.service.cli.operation.Operation.run(Operation.java:247) > at > org.apache.hive.service.cli.session.HiveSessionImpl.getColumns(HiveSessionImpl.java:729) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at >
[jira] [Work logged] (HIVE-26804) Cancel Compactions in initiated state
[ https://issues.apache.org/jira/browse/HIVE-26804?focusedWorklogId=839570=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839570 ] ASF GitHub Bot logged work on HIVE-26804: - Author: ASF GitHub Bot Created on: 17/Jan/23 10:19 Start Date: 17/Jan/23 10:19 Worklog Time Spent: 10m Work Description: rkirtir commented on code in PR #3880: URL: https://github.com/apache/hive/pull/3880#discussion_r1072024074 ## standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java: ## @@ -6242,4 +6246,86 @@ public boolean isWrapperFor(Class iface) throws SQLException { } } + @Override + @RetrySemantics.SafeToRetry + public AbortCompactResponse abortCompactions(AbortCompactionRequest reqst) throws MetaException, NoSuchCompactionException { Review Comment: As compaction related other methods are in TxnHandler, I had put it in TxnHandler. Please suggest ## standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java: ## @@ -6242,4 +6246,86 @@ public boolean isWrapperFor(Class iface) throws SQLException { } } + @Override + @RetrySemantics.SafeToRetry + public AbortCompactResponse abortCompactions(AbortCompactionRequest reqst) throws MetaException, NoSuchCompactionException { +AbortCompactResponse response = new AbortCompactResponse(new ArrayList<>()); +List requestedCompId = reqst.getCompactionIds(); +if (requestedCompId.isEmpty()) { + LOG.info("Compaction ids missing in request. No compactions to abort"); + throw new NoSuchCompactionException("ompaction ids missing in request. 
No compactions to abort"); +} +List abortCompactionResponseElementList = new ArrayList<>(); +for (int i = 0; i < requestedCompId.size(); i++) { + AbortCompactionResponseElement responseEle = abortCompaction(requestedCompId.get(i)); + abortCompactionResponseElementList.add(responseEle); +} +response.setAbortedcompacts(abortCompactionResponseElementList); +return response; + } + + @RetrySemantics.SafeToRetry + public AbortCompactionResponseElement abortCompaction(Long compId) throws MetaException { +try { + AbortCompactionResponseElement responseEle = new AbortCompactionResponseElement(); + responseEle.setCompactionIds(compId); + try (Connection dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED, connPoolMutex)) { +Optional compactionInfo = getCompactionByCompId(dbConn, compId); +if (compactionInfo.isPresent()) { + try (PreparedStatement pStmt = dbConn.prepareStatement(TxnQueries.INSERT_INTO_COMPLETED_COMPACTION)) { +CompactionInfo ci = compactionInfo.get(); +ci.errorMessage = "Compaction aborted by user"; +ci.state = TxnStore.ABORTED_STATE; +CompactionInfo.insertIntoCompletedCompactions(pStmt, ci, getDbTime(dbConn)); +int updCount = pStmt.executeUpdate(); +if (updCount != 1) { + LOG.error("Unable to update compaction record: {}. updCnt={}", ci, updCount); Review Comment: fixed Issue Time Tracking --- Worklog Id: (was: 839570) Time Spent: 1.5h (was: 1h 20m) > Cancel Compactions in initiated state > - > > Key: HIVE-26804 > URL: https://issues.apache.org/jira/browse/HIVE-26804 > Project: Hive > Issue Type: New Feature > Components: Hive >Reporter: KIRTI RUGE >Assignee: KIRTI RUGE >Priority: Major > Labels: pull-request-available > Time Spent: 1.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-26804) Cancel Compactions in initiated state
[ https://issues.apache.org/jira/browse/HIVE-26804?focusedWorklogId=839568=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839568 ] ASF GitHub Bot logged work on HIVE-26804: - Author: ASF GitHub Bot Created on: 17/Jan/23 10:18 Start Date: 17/Jan/23 10:18 Worklog Time Spent: 10m Work Description: rkirtir commented on code in PR #3880: URL: https://github.com/apache/hive/pull/3880#discussion_r1072022879 ## standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java: ## @@ -6242,4 +6246,86 @@ public boolean isWrapperFor(Class iface) throws SQLException { } } + @Override + @RetrySemantics.SafeToRetry + public AbortCompactResponse abortCompactions(AbortCompactionRequest reqst) throws MetaException, NoSuchCompactionException { +AbortCompactResponse response = new AbortCompactResponse(new ArrayList<>()); +List requestedCompId = reqst.getCompactionIds(); +if (requestedCompId.isEmpty()) { + LOG.info("Compaction ids missing in request. No compactions to abort"); + throw new NoSuchCompactionException("Compaction ids missing in request. 
No compactions to abort"); +} +List abortCompactionResponseElementList = new ArrayList<>(); +for (int i = 0; i < requestedCompId.size(); i++) { + AbortCompactionResponseElement responseEle = abortCompaction(requestedCompId.get(i)); + abortCompactionResponseElementList.add(responseEle); +} +response.setAbortedcompacts(abortCompactionResponseElementList); +return response; + } + + @RetrySemantics.SafeToRetry + public AbortCompactionResponseElement abortCompaction(Long compId) throws MetaException { +try { + AbortCompactionResponseElement responseEle = new AbortCompactionResponseElement(); + responseEle.setCompactionIds(compId); + try (Connection dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED, connPoolMutex)) { +Optional compactionInfo = getCompactionByCompId(dbConn, compId); +if (compactionInfo.isPresent()) { + try (PreparedStatement pStmt = dbConn.prepareStatement(TxnQueries.INSERT_INTO_COMPLETED_COMPACTION)) { +CompactionInfo ci = compactionInfo.get(); +ci.errorMessage = "Compaction aborted by user"; +ci.state = TxnStore.ABORTED_STATE; +CompactionInfo.insertIntoCompletedCompactions(pStmt, ci, getDbTime(dbConn)); +int updCount = pStmt.executeUpdate(); +if (updCount != 1) { + LOG.error("Unable to update compaction record: {}. updCnt={}", ci, updCount); + dbConn.rollback(); +} +LOG.debug("Inserted {} entries into COMPLETED_COMPACTIONS", updCount); +try (PreparedStatement stmt = dbConn.prepareStatement("DELETE FROM \"COMPACTION_QUEUE\" WHERE \"CQ_ID\" = ?")) { + stmt.setLong(1, ci.id); + LOG.debug("Going to execute update on COMPACTION_QUEUE <{}>", ci.id); + updCount = stmt.executeUpdate(); + if (updCount != 1) { +LOG.error("Unable to update compaction record: {}. 
updCnt={}", ci, updCount); +dbConn.rollback(); + } else { +responseEle.setMessage("Successfully Aborted Compaction "); +responseEle.setStatus("Success"); +dbConn.commit(); + } +} + } +} else { + responseEle.setMessage("Compaction element not eligible for cancellation"); + responseEle.setStatus("Error"); +} + } catch (SQLException e) { +LOG.error("Failed to abort compaction request"); +checkRetryable(e, "abortCompaction(" + compId + ")"); +responseEle.setMessage("Error while aborting compaction"); +responseEle.setStatus("Error"); + } + return responseEle; +} catch (RetryException e) { + return abortCompaction(compId); +} + + } + + private Optional getCompactionByCompId(Connection dbConn, Long compId) throws SQLException, MetaException { Review Comment: fixed. Issue Time Tracking --- Worklog Id: (was: 839568) Time Spent: 1h 10m (was: 1h) > Cancel Compactions in initiated state > - > > Key: HIVE-26804 > URL: https://issues.apache.org/jira/browse/HIVE-26804 > Project: Hive > Issue Type: New Feature > Components: Hive >Reporter: KIRTI RUGE >Assignee: KIRTI RUGE >Priority: Major > Labels: pull-request-available > Time Spent: 1h 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.10#820010)
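A side note on the quoted patch: its `catch (RetryException e) { return abortCompaction(compId); }` tail retries by calling itself recursively, with no bound on the number of attempts. A minimal, hypothetical sketch of the same retry semantics with an explicit attempt cap (the names `BoundedRetry` and `withRetries` are illustrative, not Hive APIs):

```java
import java.util.function.Supplier;

// Illustrative sketch (not the Hive implementation): retry a transient
// operation up to maxAttempts times with a loop instead of recursion,
// so a persistently failing operation cannot grow the stack.
public class BoundedRetry {
    static <T> T withRetries(Supplier<T> op, int maxAttempts) {
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.get();          // success: return immediately
            } catch (RuntimeException e) {
                last = e;                 // remember the failure and retry
            }
        }
        throw last;                       // all attempts exhausted
    }
}
```

With this shape, giving up after N failed attempts is a one-line policy change rather than a stack-depth accident.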
[jira] [Work logged] (HIVE-26804) Cancel Compactions in initiated state
[ https://issues.apache.org/jira/browse/HIVE-26804?focusedWorklogId=839569=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839569 ] ASF GitHub Bot logged work on HIVE-26804: - Author: ASF GitHub Bot Created on: 17/Jan/23 10:18 Start Date: 17/Jan/23 10:18 Worklog Time Spent: 10m Work Description: rkirtir commented on code in PR #3880: URL: https://github.com/apache/hive/pull/3880#discussion_r1072023166 ## standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnQueries.java: ## @@ -50,4 +50,20 @@ public class TxnQueries { " \"CC_HIGHEST_WRITE_ID\"" + "FROM " + " \"COMPLETED_COMPACTIONS\" ) XX "; + + + public static final String SELECT_COMPACTION_QUEUE_BY_COMPID = "SELECT \"CQ_ID\", \"CQ_DATABASE\", \"CQ_TABLE\", \"CQ_PARTITION\", " ++ "\"CQ_STATE\", \"CQ_TYPE\", \"CQ_TBLPROPERTIES\", \"CQ_WORKER_ID\", \"CQ_START\", \"CQ_RUN_AS\", " ++ "\"CQ_HIGHEST_WRITE_ID\", \"CQ_META_INFO\", \"CQ_HADOOP_JOB_ID\", \"CQ_ERROR_MESSAGE\", " ++ "\"CQ_ENQUEUE_TIME\", \"CQ_WORKER_VERSION\", \"CQ_INITIATOR_ID\", \"CQ_INITIATOR_VERSION\", " ++ "\"CQ_RETRY_RETENTION\", \"CQ_NEXT_TXN_ID\", \"CQ_TXN_ID\", \"CQ_COMMIT_TIME\", \"CQ_POOL_NAME\" " ++ "FROM \"COMPACTION_QUEUE\" WHERE \"CQ_ID\" = ? AND \"CQ_STATE\" ='i'"; + + public static final String INSERT_INTO_COMPLETED_COMPACTION = "INSERT INTO \"COMPLETED_COMPACTIONS\" " Review Comment: fixed Issue Time Tracking --- Worklog Id: (was: 839569) Time Spent: 1h 20m (was: 1h 10m) > Cancel Compactions in initiated state > - > > Key: HIVE-26804 > URL: https://issues.apache.org/jira/browse/HIVE-26804 > Project: Hive > Issue Type: New Feature > Components: Hive >Reporter: KIRTI RUGE >Assignee: KIRTI RUGE >Priority: Major > Labels: pull-request-available > Time Spent: 1h 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-26804) Cancel Compactions in initiated state
[ https://issues.apache.org/jira/browse/HIVE-26804?focusedWorklogId=839567=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839567 ] ASF GitHub Bot logged work on HIVE-26804: - Author: ASF GitHub Bot Created on: 17/Jan/23 10:17 Start Date: 17/Jan/23 10:17 Worklog Time Spent: 10m Work Description: rkirtir commented on code in PR #3880: URL: https://github.com/apache/hive/pull/3880#discussion_r1072022432 ## standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift: ## @@ -1393,6 +1393,22 @@ struct ShowCompactResponse { 1: required list compacts, } +struct AbortCompactionRequest { +1: required list compactionIds, +2: optional string type, +3: optional string poolName +} + +struct AbortCompactionResponseElement { +1: required i64 compactionIds, Review Comment: fixed ## ql/src/java/org/apache/hadoop/hive/ql/ddl/process/abort/compaction/AbortCompactionsOperation.java: ## @@ -0,0 +1,81 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.hadoop.hive.ql.ddl.process.abort.compaction; + +import org.apache.hadoop.fs.Path; +import org.apache.hadoop.hive.metastore.api.AbortCompactResponse; +import org.apache.hadoop.hive.metastore.api.AbortCompactionRequest; +import org.apache.hadoop.hive.metastore.api.AbortCompactionResponseElement; +import org.apache.hadoop.hive.ql.ddl.DDLOperation; +import org.apache.hadoop.hive.ql.ddl.DDLOperationContext; +import org.apache.hadoop.hive.ql.ddl.ShowUtils; +import org.apache.hadoop.hive.ql.exec.Utilities; +import org.apache.hadoop.hive.ql.metadata.HiveException; + +import java.io.DataOutputStream; +import java.io.IOException; + + +/** + * Operation process of aborting compactions. + */ +public class AbortCompactionsOperation extends DDLOperation { +public AbortCompactionsOperation(DDLOperationContext context, AbortCompactionsDesc desc) { +super(context, desc); +} + +@Override +public int execute() throws HiveException { +AbortCompactionRequest request = new AbortCompactionRequest(); +request.setCompactionIds(desc.getCompactionIds()); +AbortCompactResponse response = context.getDb().abortCompactions(request); +try (DataOutputStream os = ShowUtils.getOutputStream(new Path(desc.getResFile()), context)) { +writeHeader(os); +if (response.getAbortedcompacts() != null) { +for (AbortCompactionResponseElement e : response.getAbortedcompacts()) { +writeRow(os, e); +} +} +} catch (Exception e) { +LOG.warn("show compactions: ", e); Review Comment: fixed Issue Time Tracking --- Worklog Id: (was: 839567) Time Spent: 1h (was: 50m) > Cancel Compactions in initiated state > - > > Key: HIVE-26804 > URL: https://issues.apache.org/jira/browse/HIVE-26804 > Project: Hive > Issue Type: New Feature > Components: Hive >Reporter: KIRTI RUGE >Assignee: KIRTI RUGE >Priority: Major > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-26904) QueryCompactor failed in commitCompaction if the tmp table dir is already removed
[ https://issues.apache.org/jira/browse/HIVE-26904?focusedWorklogId=839565=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839565 ] ASF GitHub Bot logged work on HIVE-26904: - Author: ASF GitHub Bot Created on: 17/Jan/23 10:16 Start Date: 17/Jan/23 10:16 Worklog Time Spent: 10m Work Description: stiga-huang commented on code in PR #3910: URL: https://github.com/apache/hive/pull/3910#discussion_r1072021489 ## ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/QueryCompactor.java: ## @@ -245,16 +243,32 @@ private static void disableLlapCaching(HiveConf conf) { * @throws IOException the directory cannot be deleted * @throws HiveException the table is not found */ -static void cleanupEmptyDir(HiveConf conf, String tmpTableName) throws IOException, HiveException { +static void cleanupEmptyTableDir(HiveConf conf, String tmpTableName) +throws IOException, HiveException { org.apache.hadoop.hive.ql.metadata.Table tmpTable = Hive.get().getTable(tmpTableName); if (tmpTable != null) { -Path path = new Path(tmpTable.getSd().getLocation()); -FileSystem fs = path.getFileSystem(conf); +cleanupEmptyDir(conf, new Path(tmpTable.getSd().getLocation())); + } +} + +/** + * Remove the directory if it's empty. + * @param conf the Hive configuration + * @param path path of the directory + * @throws IOException if any IO error occurs + */ +static void cleanupEmptyDir(HiveConf conf, Path path) throws IOException { + FileSystem fs = path.getFileSystem(conf); + try { if (!fs.listFiles(path, false).hasNext()) { fs.delete(path, true); } + } catch (FileNotFoundException e) { +// Ignore the case when the dir was already removed +LOG.warn("Ignored exception during cleanup {}", path, e); Review Comment: It could be deleted before `listFiles()`. The `FileNotFoundException` is thrown from `listFiles()`. 
Issue Time Tracking --- Worklog Id: (was: 839565) Time Spent: 40m (was: 0.5h) > QueryCompactor failed in commitCompaction if the tmp table dir is already > removed > -- > > Key: HIVE-26904 > URL: https://issues.apache.org/jira/browse/HIVE-26904 > Project: Hive > Issue Type: Bug >Reporter: Quanlong Huang >Assignee: Quanlong Huang >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > commitCompaction() of query-based compactions just remove the dirs of tmp > tables. It should not fail the compaction if the dirs are already removed. > We've seen such a failure in Impala's test (IMPALA-11756): > {noformat} > 2023-01-02T02:09:26,306 INFO [HiveServer2-Background-Pool: Thread-695] > ql.Driver: Executing > command(queryId=jenkins_20230102020926_69112755-b783-4214-89e5-1c7111dfe15f): > alter table partial_catalog_info_test.insert_only_partitioned partition > (part=1) compact 'minor' and wait > 2023-01-02T02:09:26,306 INFO [HiveServer2-Background-Pool: Thread-695] > ql.Driver: Starting task [Stage-0:DDL] in serial mode > 2023-01-02T02:09:26,317 INFO [HiveServer2-Background-Pool: Thread-695] > exec.Task: Compaction enqueued with id 15 > ... > 2023-01-02T02:12:55,849 ERROR > [impala-ec2-centos79-m6i-4xlarge-ondemand-1428.vpc.cloudera.com-48_executor] > compactor.Worker: Caught exception while trying to compact > id:15,dbname:partial_catalog_info_test,tableName:insert_only_partitioned,partName:part=1,state:^@,type:MINOR,enqueueTime:0,start:0,properties:null,runAs:jenkins,tooManyAborts:false,hasOldAbort:false,highestWriteId:3,errorMessage:null,workerId: > null,initiatorId: null,retryRetention0. Marking failed to avoid repeated > failures > java.io.FileNotFoundException: File > hdfs://localhost:20500/tmp/hive/jenkins/092b533a-81c8-4b95-88e4-9472cf6f365d/_tmp_space.db/62ec04fb-e2d2-4a99-a454-ae709a3cccfe > does not exist. 
> at > org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.(DistributedFileSystem.java:1275) > ~[hadoop-hdfs-client-3.1.1.7.2.15.4-6.jar:?] > at > org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.(DistributedFileSystem.java:1249) > ~[hadoop-hdfs-client-3.1.1.7.2.15.4-6.jar:?] > at > org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1194) > ~[hadoop-hdfs-client-3.1.1.7.2.15.4-6.jar:?] > at > org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1190) > ~[hadoop-hdfs-client-3.1.1.7.2.15.4-6.jar:?] > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) >
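The race the patch handles can be shown in a self-contained way. The sketch below transposes the idea from Hadoop's `FileSystem` API to `java.nio.file` (so it runs without HDFS); the method name and return value are illustrative, not Hive's actual `cleanupEmptyDir` signature:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

// Illustrative sketch: delete a directory only if it is empty, and treat
// "directory already gone" (a concurrent delete between the listing and
// the delete, or before the listing) as a non-error, as the patch does.
public class CleanupSketch {
    static boolean cleanupEmptyDir(Path dir) {
        try {
            try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
                if (entries.iterator().hasNext()) {
                    return false;         // not empty: leave it in place
                }
            }
            Files.deleteIfExists(dir);    // itself tolerant of a concurrent delete
            return true;
        } catch (NoSuchFileException e) {
            return false;                 // vanished before/during listing: ignore
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

`NoSuchFileException` plays the role of `FileNotFoundException` in the HDFS stack trace quoted above: it can surface from the listing call, not only from the delete.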
[jira] [Assigned] (HIVE-26954) Upgrade Avro to 1.11.1
[ https://issues.apache.org/jira/browse/HIVE-26954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akshat Mathur reassigned HIVE-26954: > Upgrade Avro to 1.11.1 > -- > > Key: HIVE-26954 > URL: https://issues.apache.org/jira/browse/HIVE-26954 > Project: Hive > Issue Type: Improvement >Reporter: Akshat Mathur >Assignee: Akshat Mathur >Priority: Major > > Upgrade Avro dependencies to 1.11.1 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-26793) Create a new configuration to override "no compaction" for tables
[ https://issues.apache.org/jira/browse/HIVE-26793?focusedWorklogId=839559=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839559 ] ASF GitHub Bot logged work on HIVE-26793: - Author: ASF GitHub Bot Created on: 17/Jan/23 10:03 Start Date: 17/Jan/23 10:03 Worklog Time Spent: 10m Work Description: deniskuzZ commented on code in PR #3822: URL: https://github.com/apache/hive/pull/3822#discussion_r1072005883 ## ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java: ## @@ -613,8 +613,9 @@ private boolean isEligibleForCompaction(CompactionInfo ci, return false; } - if (isNoAutoCompactSet(t.getParameters())) { -LOG.info("Table " + tableName(t) + " marked " + hive_metastoreConstants.TABLE_NO_AUTO_COMPACT + + Map dbParams = computeIfAbsent(ci.dbname, () -> resolveDatabase(ci)).getParameters(); Review Comment: if (replIsCompactionDisabledForTable(t)) { skipTables.add(ci.getFullTableName()); return false; } Would be better if we could refactor the method and use a cache of skipTables/skipDBs instead of doing the same evaluation (isNoAutoCompactSet) for every Table in skipped Db / every Partition of skipped Table Issue Time Tracking --- Worklog Id: (was: 839559) Time Spent: 3h 50m (was: 3h 40m) > Create a new configuration to override "no compaction" for tables > - > > Key: HIVE-26793 > URL: https://issues.apache.org/jira/browse/HIVE-26793 > Project: Hive > Issue Type: Improvement >Reporter: Kokila N >Assignee: Kokila N >Priority: Major > Labels: pull-request-available > Time Spent: 3h 50m > Remaining Estimate: 0h > > Currently a simple user can create a table with > {color:#6a8759}no_auto_compaction=true{color} table property and create an > aborted write transaction writing to this table. This way a malicious user > can prevent cleaning up data for the aborted transaction, creating > performance degradation. 
> This configuration should be allowed to be overridden on a database level: > adding {color:#6a8759}no_auto_compaction=false{color} should override the > table level setting forcing the initiator to schedule compaction for all > tables. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-26793) Create a new configuration to override "no compaction" for tables
[ https://issues.apache.org/jira/browse/HIVE-26793?focusedWorklogId=839558=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839558 ] ASF GitHub Bot logged work on HIVE-26793: - Author: ASF GitHub Bot Created on: 17/Jan/23 10:02 Start Date: 17/Jan/23 10:02 Worklog Time Spent: 10m Work Description: deniskuzZ commented on code in PR #3822: URL: https://github.com/apache/hive/pull/3822#discussion_r1072005883 ## ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java: ## @@ -613,8 +613,9 @@ private boolean isEligibleForCompaction(CompactionInfo ci, return false; } - if (isNoAutoCompactSet(t.getParameters())) { -LOG.info("Table " + tableName(t) + " marked " + hive_metastoreConstants.TABLE_NO_AUTO_COMPACT + + Map dbParams = computeIfAbsent(ci.dbname, () -> resolveDatabase(ci)).getParameters(); Review Comment: if (replIsCompactionDisabledForTable(t)) { skipTables.add(ci.getFullTableName()); return false; } Would be better if we could refactor the method and use a cache of skipTables/skipDBs instead of doing the same evaluation for every Table in skipped Db / every Partition of skipped Table Issue Time Tracking --- Worklog Id: (was: 839558) Time Spent: 3h 40m (was: 3.5h) > Create a new configuration to override "no compaction" for tables > - > > Key: HIVE-26793 > URL: https://issues.apache.org/jira/browse/HIVE-26793 > Project: Hive > Issue Type: Improvement >Reporter: Kokila N >Assignee: Kokila N >Priority: Major > Labels: pull-request-available > Time Spent: 3h 40m > Remaining Estimate: 0h > > Currently a simple user can create a table with > {color:#6a8759}no_auto_compaction=true{color} table property and create an > aborted write transaction writing to this table. This way a malicious user > can prevent cleaning up data for the aborted transaction, creating > performance degradation. 
> This configuration should be allowed to be overridden on a database level: > adding {color:#6a8759}no_auto_compaction=false{color} should override the > table level setting forcing the initiator to schedule compaction for all > tables. -- This message was sent by Atlassian Jira (v8.20.10#820010)
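The reviewer's suggestion above, caching the per-database lookup instead of re-evaluating `isNoAutoCompactSet` for every table and partition, is easy to sketch with `Map.computeIfAbsent`. Everything here (`SkipCache`, `paramsFor`, the resolver function) is a hypothetical illustration, not Hive's Initiator code:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Illustrative sketch: memoize database-parameter resolution so each
// database is resolved once per Initiator cycle, however many tables
// or partitions it contains.
public class SkipCache {
    private final Map<String, Map<String, String>> dbParams = new HashMap<>();
    private final Function<String, Map<String, String>> resolver;
    int resolveCalls = 0; // exposed only to demonstrate the memoization

    SkipCache(Function<String, Map<String, String>> resolver) {
        this.resolver = resolver;
    }

    Map<String, String> paramsFor(String dbName) {
        return dbParams.computeIfAbsent(dbName, name -> {
            resolveCalls++;               // metastore hit only on a cache miss
            return resolver.apply(name);
        });
    }

    boolean noAutoCompact(String dbName) {
        return Boolean.parseBoolean(
            paramsFor(dbName).getOrDefault("no_auto_compaction", "false"));
    }
}
```

The same pattern extends naturally to the `skipTables`/`skipDBs` sets the reviewer mentions: populate them once per cycle and consult the set thereafter.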
[jira] [Work logged] (HIVE-26904) QueryCompactor failed in commitCompaction if the tmp table dir is already removed
[ https://issues.apache.org/jira/browse/HIVE-26904?focusedWorklogId=839551=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839551 ] ASF GitHub Bot logged work on HIVE-26904: - Author: ASF GitHub Bot Created on: 17/Jan/23 09:55 Start Date: 17/Jan/23 09:55 Worklog Time Spent: 10m Work Description: deniskuzZ commented on code in PR #3910: URL: https://github.com/apache/hive/pull/3910#discussion_r1071993848 ## ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/QueryCompactor.java: ## @@ -245,16 +243,32 @@ private static void disableLlapCaching(HiveConf conf) { * @throws IOException the directory cannot be deleted * @throws HiveException the table is not found */ -static void cleanupEmptyDir(HiveConf conf, String tmpTableName) throws IOException, HiveException { +static void cleanupEmptyTableDir(HiveConf conf, String tmpTableName) +throws IOException, HiveException { org.apache.hadoop.hive.ql.metadata.Table tmpTable = Hive.get().getTable(tmpTableName); if (tmpTable != null) { -Path path = new Path(tmpTable.getSd().getLocation()); -FileSystem fs = path.getFileSystem(conf); +cleanupEmptyDir(conf, new Path(tmpTable.getSd().getLocation())); + } +} + +/** + * Remove the directory if it's empty. + * @param conf the Hive configuration + * @param path path of the directory + * @throws IOException if any IO error occurs + */ +static void cleanupEmptyDir(HiveConf conf, Path path) throws IOException { + FileSystem fs = path.getFileSystem(conf); + try { if (!fs.listFiles(path, false).hasNext()) { fs.delete(path, true); } + } catch (FileNotFoundException e) { +// Ignore the case when the dir was already removed +LOG.warn("Ignored exception during cleanup {}", path, e); Review Comment: tmpDir gets deleted between the listing and the actual delete command? 
Issue Time Tracking --- Worklog Id: (was: 839551) Time Spent: 0.5h (was: 20m) > QueryCompactor failed in commitCompaction if the tmp table dir is already > removed > -- > > Key: HIVE-26904 > URL: https://issues.apache.org/jira/browse/HIVE-26904 > Project: Hive > Issue Type: Bug >Reporter: Quanlong Huang >Assignee: Quanlong Huang >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > commitCompaction() of query-based compactions just remove the dirs of tmp > tables. It should not fail the compaction if the dirs are already removed. > We've seen such a failure in Impala's test (IMPALA-11756): > {noformat} > 2023-01-02T02:09:26,306 INFO [HiveServer2-Background-Pool: Thread-695] > ql.Driver: Executing > command(queryId=jenkins_20230102020926_69112755-b783-4214-89e5-1c7111dfe15f): > alter table partial_catalog_info_test.insert_only_partitioned partition > (part=1) compact 'minor' and wait > 2023-01-02T02:09:26,306 INFO [HiveServer2-Background-Pool: Thread-695] > ql.Driver: Starting task [Stage-0:DDL] in serial mode > 2023-01-02T02:09:26,317 INFO [HiveServer2-Background-Pool: Thread-695] > exec.Task: Compaction enqueued with id 15 > ... > 2023-01-02T02:12:55,849 ERROR > [impala-ec2-centos79-m6i-4xlarge-ondemand-1428.vpc.cloudera.com-48_executor] > compactor.Worker: Caught exception while trying to compact > id:15,dbname:partial_catalog_info_test,tableName:insert_only_partitioned,partName:part=1,state:^@,type:MINOR,enqueueTime:0,start:0,properties:null,runAs:jenkins,tooManyAborts:false,hasOldAbort:false,highestWriteId:3,errorMessage:null,workerId: > null,initiatorId: null,retryRetention0. Marking failed to avoid repeated > failures > java.io.FileNotFoundException: File > hdfs://localhost:20500/tmp/hive/jenkins/092b533a-81c8-4b95-88e4-9472cf6f365d/_tmp_space.db/62ec04fb-e2d2-4a99-a454-ae709a3cccfe > does not exist. 
> at > org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.(DistributedFileSystem.java:1275) > ~[hadoop-hdfs-client-3.1.1.7.2.15.4-6.jar:?] > at > org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.(DistributedFileSystem.java:1249) > ~[hadoop-hdfs-client-3.1.1.7.2.15.4-6.jar:?] > at > org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1194) > ~[hadoop-hdfs-client-3.1.1.7.2.15.4-6.jar:?] > at > org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1190) > ~[hadoop-hdfs-client-3.1.1.7.2.15.4-6.jar:?] > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > ~[hadoop-common-3.1.1.7.2.15.4-6.jar:?] >
[jira] [Work logged] (HIVE-22977) Merge delta files instead of running a query in major/minor compaction
[ https://issues.apache.org/jira/browse/HIVE-22977?focusedWorklogId=839547=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839547 ] ASF GitHub Bot logged work on HIVE-22977: - Author: ASF GitHub Bot Created on: 17/Jan/23 09:43 Start Date: 17/Jan/23 09:43 Worklog Time Spent: 10m Work Description: sonarcloud[bot] commented on PR #3801: URL: https://github.com/apache/hive/pull/3801#issuecomment-1385108588 Kudos, SonarCloud Quality Gate passed! 0 Bugs, 0 Vulnerabilities, 0 Security Hotspots, 2 Code Smells; no coverage or duplication information. Issue Time Tracking --- Worklog Id: (was: 839547) Time Spent: 4.5h (was: 4h 20m) > Merge delta files instead of running a query in major/minor compaction > -- > > Key: HIVE-22977 > URL: https://issues.apache.org/jira/browse/HIVE-22977 > Project: Hive > Issue Type: Improvement >Reporter: László Pintér >Assignee: Sourabh Badhya >Priority: Major > Labels: pull-request-available > Attachments: HIVE-22977.01.patch, HIVE-22977.02.patch > > Time Spent: 4.5h > Remaining Estimate: 0h > > [Compaction Optimization] > We should analyse the possibility to move a delta file instead of running a > major/minor compaction query. > Please consider the following use cases: > - full acid table but only insert queries were run. 
This means that no > delete delta directories were created. Is it possible to merge the delta > directory contents without running a compaction query? > - full acid table, initiating queries through the streaming API. If there > are no abort transactions during the streaming, is it possible to merge the > delta directory contents without running a compaction query? -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-26943) Fix NPE during Optimised Bootstrap when db is dropped
[ https://issues.apache.org/jira/browse/HIVE-26943?focusedWorklogId=839546=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839546 ] ASF GitHub Bot logged work on HIVE-26943: - Author: ASF GitHub Bot Created on: 17/Jan/23 09:42 Start Date: 17/Jan/23 09:42 Worklog Time Spent: 10m Work Description: sonarcloud[bot] commented on PR #3953: URL: https://github.com/apache/hive/pull/3953#issuecomment-1385107300 Kudos, SonarCloud Quality Gate passed! 0 Bugs, 0 Vulnerabilities, 0 Security Hotspots, 0 Code Smells; no coverage or duplication information. Issue Time Tracking --- Worklog Id: (was: 839546) Time Spent: 40m (was: 0.5h) > Fix NPE during Optimised Bootstrap when db is dropped > - > > Key: HIVE-26943 > URL: https://issues.apache.org/jira/browse/HIVE-26943 > Project: Hive > Issue Type: Task >Reporter: Shreenidhi >Assignee: Shreenidhi >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > Consider the steps: > 1. Current replication is from A (source) -> B(target) > 2. Failover is complete > so now A (target) <- B(source) > 3. Suppose db at A is dropped before reverse replication. > 4. Now when reverse replication triggers optimised bootstrap it will throw NPE > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HIVE-26953) Exception in alter partitions with oracle db when partitions are more than 1000. Exception: ORA-01795: maximum number of expressions in a list is 1000
[ https://issues.apache.org/jira/browse/HIVE-26953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Venugopal Reddy K updated HIVE-26953: - Attachment: (was: partdata1001-1) > Exception in alter partitions with oracle db when partitions are more than > 1000. Exception: ORA-01795: maximum number of expressions in a list is 1000 > -- > > Key: HIVE-26953 > URL: https://issues.apache.org/jira/browse/HIVE-26953 > Project: Hive > Issue Type: Bug >Reporter: Venugopal Reddy K >Priority: Major > Attachments: partdata1001 > > > *[Description]* > Alter partitions with oracle db throws exception when the number of > partitions are more than 1000. Oracle db has limitation on number of values > passed in the IN operator. It cannot exceed 1000. > *[Steps to reproduce]* > Create stage table, load data that has 1000+ rows into stage table, create > partition table and load data into the table from the stage table. data > file[^partdata1001] is attached below. > > {code:java} > 0: jdbc:hive2://localhost:1> create database mydb; > 0: jdbc:hive2://localhost:1> use mydb; > > 0: jdbc:hive2://localhost:1> create table stage(sr int, st string, name > string) row format delimited fields terminated by '\t' stored as textfile; > > 0: jdbc:hive2://localhost:1> load data local inpath 'partdata1001' into > table stage; > > 0: jdbc:hive2://localhost:1> create table dynpart(num int, name string) > partitioned by (category string) row format delimited fields terminated by > '\t' stored as textfile; > > 0: jdbc:hive2://localhost:1> insert into dynpart select * from stage; > {code} > > *Alter partition throws exception(ORA-01795: maximum number of expressions in > a list is 1000) during BasicStatsTask.aggregateStats. 
This issue occurs with > oracle db due to its limitation of number of values in the IN operator.* > *[Exception Stack]* > > {code:java} > NestedThrowables: > java.sql.SQLSyntaxErrorException: ORA-01795: maximum number of expressions in > a list is 1000 > at org.apache.hadoop.hive.ql.metadata.Hive.alterPartitions(Hive.java:1145) > at > org.apache.hadoop.hive.ql.stats.BasicStatsTask.aggregateStats(BasicStatsTask.java:380) > at > org.apache.hadoop.hive.ql.stats.BasicStatsTask.process(BasicStatsTask.java:108) > at org.apache.hadoop.hive.ql.exec.StatsTask.execute(StatsTask.java:107) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:214) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105) > at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:354) > at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:327) > at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:244) > at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:105) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:370) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:205) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:154) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:149) > at > org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:185) > at > org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:236) > at > org.apache.hive.service.cli.operation.SQLOperation.access$500(SQLOperation.java:90) > at > org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:340) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682) > at > org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:360) > at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: MetaException(message:javax.jdo.JDOException: Exception thrown > when executing query : SELECT DISTINCT > 'org.apache.hadoop.hive.metastore.model.MPartition' AS > DN_TYPE,A0.CREATE_TIME,A0.LAST_ACCESS_TIME,A0.PART_NAME,A0.WRITE_ID,A0.PART_ID > FROM PARTITIONS A0 LEFT OUTER JOIN TBLS B0 ON A0.TBL_ID = B0.TBL_ID LEFT > OUTER JOIN DBS C0 ON B0.DB_ID = C0.DB_ID WHERE B0.TBL_NAME = ? AND C0."NAME" > = ? AND A0.PART_NAME >
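Independently of how Hive eventually fixes this, the standard workaround for ORA-01795 is to split a large value list into batches of at most 1000 entries and issue one IN predicate (or one query) per batch. A self-contained sketch — class and method names here are illustrative, not Hive's:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only, not Hive's actual patch: split a value list
// into chunks of at most 1000 so each chunk fits Oracle's IN-list limit.
public class InListBatcher {

    // Oracle rejects IN lists with more than 1000 expressions (ORA-01795).
    static final int ORACLE_IN_LIST_LIMIT = 1000;

    // Splits the values into consecutive sublists no longer than the limit.
    static <T> List<List<T>> batches(List<T> values) {
        List<List<T>> out = new ArrayList<>();
        for (int i = 0; i < values.size(); i += ORACLE_IN_LIST_LIMIT) {
            out.add(values.subList(i, Math.min(i + ORACLE_IN_LIST_LIMIT, values.size())));
        }
        return out;
    }

    public static void main(String[] args) {
        // 1001 partition ids, one more than the Oracle limit.
        List<Integer> partIds = new ArrayList<>();
        for (int i = 0; i < 1001; i++) {
            partIds.add(i);
        }
        List<List<Integer>> chunks = batches(partIds);
        System.out.println(chunks.size());        // 2
        System.out.println(chunks.get(0).size()); // 1000
        System.out.println(chunks.get(1).size()); // 1
    }
}
```

The caller would then build one `PART_NAME IN (...)` predicate per chunk and combine them with OR, or run one query per chunk and merge the results.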
[jira] [Updated] (HIVE-26953) Exception in alter partitions with oracle db when partitions are more than 1000. Exception: ORA-01795: maximum number of expressions in a list is 1000
[ https://issues.apache.org/jira/browse/HIVE-26953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Venugopal Reddy K updated HIVE-26953: - Attachment: partdata1001-1 partdata1001 Description: (same description, reproduction steps, and exception stack as in the update above)
[jira] [Assigned] (HIVE-26950) (CTLT) Create external table like V2 table is not preserving table properties
[ https://issues.apache.org/jira/browse/HIVE-26950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena reassigned HIVE-26950: --- Assignee: Ayush Saxena > (CTLT) Create external table like V2 table is not preserving table properties > - > > Key: HIVE-26950 > URL: https://issues.apache.org/jira/browse/HIVE-26950 > Project: Hive > Issue Type: Improvement > Components: Iceberg integration >Reporter: Rajesh Balamohan >Assignee: Ayush Saxena >Priority: Major > > # Create an external Iceberg V2 table, e.g. t1 > # "create external table t2 like t1" <--- This ends up creating a V1 table: neither > "format-version=2" nor "'format'='iceberg/parquet'" is retained. -- This message was sent by Atlassian Jira (v8.20.10#820010)
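A possible interim workaround, assuming the cluster allows setting Iceberg table properties after creation (an assumption, not something stated in the issue), is to reapply the dropped property explicitly after the CTLT statement:

{code:sql}
create external table t2 like t1;
-- Reapply the property that CTLT dropped (name/value taken from the issue above).
alter table t2 set tblproperties ('format-version'='2');
{code}

Whether this fully restores V2 semantics for t2 would need verifying; the proper fix is for CTLT itself to carry the properties over.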
[jira] [Updated] (HIVE-26933) Cleanup dump directory for eventId which was failed in previous dump cycle
[ https://issues.apache.org/jira/browse/HIVE-26933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-26933: -- Labels: pull-request-available (was: ) > Cleanup dump directory for eventId which was failed in previous dump cycle > -- > > Key: HIVE-26933 > URL: https://issues.apache.org/jira/browse/HIVE-26933 > Project: Hive > Issue Type: Improvement >Reporter: Harshal Patel >Assignee: Harshal Patel >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > # If the incremental dump operation fails while dumping an event id into the staging directory, the dump directory for that event id (along with the _dumpmetadata file) remains in the dump location, while the event id is recorded in the _events_dump file > # When the user triggers the dump operation for this policy again, it resumes from the failed event id and tries to dump it again; because that event id's directory was already created in the previous cycle, it fails with the exception > {noformat} > [Scheduled Query Executor(schedule:repl_policytest7, execution_id:7181)]: > FAILED: Execution Error, return code 4 from > org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask. 
> org.apache.hadoop.fs.FileAlreadyExistsException: > /warehouse/tablespace/staging/policytest7/dGVzdDc=/14bcf976-662b-4237-b5bb-e7d63a1d089f/hive/137961/_dumpmetadata > for client 172.27.182.5 already exists > at > org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.startFile(FSDirWriteFileOp.java:388) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2576) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2473) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:773) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:490) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:533) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:989) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:917) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2894){noformat} > -- This message was sent by Atlassian Jira (v8.20.10#820010)
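The fix in the linked pull request (cleanFailedEventDirIfExists) amounts to deleting the leftover event directory before retrying the dump. A sketch of that idea using java.nio on the local filesystem for illustration — the real code operates on HDFS through Hadoop's FileSystem API, and only the method name is taken from the work log:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

// Sketch of the cleanup idea: before re-dumping an event id, remove any
// directory left behind by the failed cycle so the retried dump cannot
// hit FileAlreadyExistsException when recreating _dumpmetadata.
public class FailedEventDirCleaner {

    // Recursively deletes the event directory if it already exists.
    static void cleanFailedEventDirIfExists(Path eventDir) throws IOException {
        if (!Files.exists(eventDir)) {
            return; // nothing left over from a previous cycle
        }
        try (Stream<Path> walk = Files.walk(eventDir)) {
            // Reverse order deletes children before their parent directories.
            walk.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
    }

    public static void main(String[] args) throws IOException {
        // Simulate a dump directory left behind by a failed cycle.
        Path eventDir = Files.createTempDirectory("event-dump");
        Files.createFile(eventDir.resolve("_dumpmetadata"));
        cleanFailedEventDirIfExists(eventDir);
        System.out.println(Files.exists(eventDir)); // false: safe to re-dump
    }
}
```

Making the cleanup idempotent (a no-op when the directory is absent) means it can run unconditionally at the start of every event dump.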
[jira] [Work logged] (HIVE-26933) Cleanup dump directory for eventId which was failed in previous dump cycle
[ https://issues.apache.org/jira/browse/HIVE-26933?focusedWorklogId=839538=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839538 ] ASF GitHub Bot logged work on HIVE-26933: - Author: ASF GitHub Bot Created on: 17/Jan/23 08:58 Start Date: 17/Jan/23 08:58 Worklog Time Spent: 10m Work Description: harshal-16 opened a new pull request, #3960: URL: https://github.com/apache/hive/pull/3960 Problem: - If the incremental dump operation fails while dumping an event id into the staging directory, the dump directory for that event id (along with the _dumpmetadata file) remains in the dump location, while the event id is recorded in the _events_dump file. - When the user triggers the dump operation for this policy again, it resumes from the failed event id and tries to dump it again; because that event id's directory was already created in the previous cycle, it fails with the exception. Solution: - Fixed cleanFailedEventDirIfExists to remove the folder for the failed event id for the selected database. Issue Time Tracking --- Worklog Id: (was: 839538) Remaining Estimate: 0h Time Spent: 10m > Cleanup dump directory for eventId which was failed in previous dump cycle > -- > > Key: HIVE-26933 > URL: https://issues.apache.org/jira/browse/HIVE-26933 > Project: Hive > Issue Type: Improvement >Reporter: Harshal Patel >Assignee: Harshal Patel >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > (same description and exception stack as in the update above) -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-26932) Correct stage name value in replication_metrics.progress column in replication_metrics table
[ https://issues.apache.org/jira/browse/HIVE-26932?focusedWorklogId=839537=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839537 ] ASF GitHub Bot logged work on HIVE-26932: - Author: ASF GitHub Bot Created on: 17/Jan/23 08:56 Start Date: 17/Jan/23 08:56 Worklog Time Spent: 10m Work Description: harshal-16 closed pull request #3958: HIVE-26932: Cleanup dump directory for eventId which was failed in previous dump cycle URL: https://github.com/apache/hive/pull/3958 Issue Time Tracking --- Worklog Id: (was: 839537) Time Spent: 40m (was: 0.5h) > Correct stage name value in replication_metrics.progress column in > replication_metrics table > > > Key: HIVE-26932 > URL: https://issues.apache.org/jira/browse/HIVE-26932 > Project: Hive > Issue Type: Improvement >Reporter: Harshal Patel >Assignee: Harshal Patel >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > To improve diagnostic capability from Source to backup replication, update > replication_metrics table by adding pre_optimized_bootstrap in progress bar > in case of optimized bootstrap first cycle. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-26952) set the value of metastore.storage.schema.reader.impl to org.apache.hadoop.hive.metastore.SerDeStorageSchemaReader as default
[ https://issues.apache.org/jira/browse/HIVE-26952?focusedWorklogId=839535=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839535 ] ASF GitHub Bot logged work on HIVE-26952: - Author: ASF GitHub Bot Created on: 17/Jan/23 08:44 Start Date: 17/Jan/23 08:44 Worklog Time Spent: 10m Work Description: tarak271 opened a new pull request, #3959: URL: https://github.com/apache/hive/pull/3959 …o org.apache.hadoop.hive.metastore.SerDeStorageSchemaReader as default ### What changes were proposed in this pull request? Set the config metastore.storage.schema.reader.impl to org.apache.hadoop.hive.metastore.SerDeStorageSchemaReader by default. ### Why are the changes needed? Previously the config defaulted to DefaultStorageSchemaReader, which only reports "Storage schema reading not supported". SerDeStorageSchemaReader was introduced with an implementation that can read the schema from storage, so making it the default spares users from having to set this config themselves in future releases. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? 
There is no functionality change introduced, so existing test cases should not be failing with these config value changes Issue Time Tracking --- Worklog Id: (was: 839535) Remaining Estimate: 0h Time Spent: 10m > set the value of metastore.storage.schema.reader.impl to > org.apache.hadoop.hive.metastore.SerDeStorageSchemaReader as default > -- > > Key: HIVE-26952 > URL: https://issues.apache.org/jira/browse/HIVE-26952 > Project: Hive > Issue Type: Improvement > Components: Standalone Metastore >Reporter: Taraka Rama Rao Lethavadla >Assignee: Taraka Rama Rao Lethavadla >Priority: Minor > Time Spent: 10m > Remaining Estimate: 0h > > With the default value of > > {code:java} > DefaultStorageSchemaReader.class.getName(){code} > > in the Metastore Config, *metastore.storage.schema.reader.impl* > below exception is thrown when trying to read Avro schema > {noformat} > Caused by: org.apache.hive.service.cli.HiveSQLException: MetaException > (message:java.lang.UnsupportedOperationException: Storage schema reading not > supported) > at > org.apache.hive.service.cli.operation.GetColumnsOperation.runInternal(GetColumnsOperation.java:213) > at org.apache.hive.service.cli.operation.Operation.run(Operation.java:247) > at > org.apache.hive.service.cli.session.HiveSessionImpl.getColumns(HiveSessionImpl.java:729) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78) > at > org.apache.hive.service.cli.session.HiveSessionProxy.access-zsh(HiveSessionProxy.java:36) > at > org.apache.hive.service.cli.session.HiveSessionProxy.run(HiveSessionProxy.java:63) > at java.security.AccessController.doPrivileged(Native Method) > at 
javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) > at > org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59) > at com.sun.proxy..getColumns(Unknown Source) > at > org.apache.hive.service.cli.CLIService.getColumns(CLIService.java:390){noformat} > setting the above config with > *org.apache.hadoop.hive.metastore.SerDeStorageSchemaReader* resolves issue > Proposing to make this value as default in code base, so that in upcoming > versions we don't have to set this value manually -- This message was sent by Atlassian Jira (v8.20.10#820010)
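Until the default changes, the workaround described in the issue can be applied as a configuration override; in hive-site.xml (or the metastore's site file) it would look like this, with the property name and value exactly as given above:

{code:xml}
<property>
  <name>metastore.storage.schema.reader.impl</name>
  <value>org.apache.hadoop.hive.metastore.SerDeStorageSchemaReader</value>
</property>
{code}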
[jira] [Updated] (HIVE-26952) set the value of metastore.storage.schema.reader.impl to org.apache.hadoop.hive.metastore.SerDeStorageSchemaReader as default
[ https://issues.apache.org/jira/browse/HIVE-26952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-26952: -- Labels: pull-request-available (was: ) > set the value of metastore.storage.schema.reader.impl to > org.apache.hadoop.hive.metastore.SerDeStorageSchemaReader as default > -- > > (same issue description and exception stack as in the work log entry above) -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Work logged] (HIVE-26711) The very first REPL Load should make the Target Database read-only
[ https://issues.apache.org/jira/browse/HIVE-26711?focusedWorklogId=839528=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839528 ] ASF GitHub Bot logged work on HIVE-26711: - Author: ASF GitHub Bot Created on: 17/Jan/23 08:08 Start Date: 17/Jan/23 08:08 Worklog Time Spent: 10m Work Description: shreenidhiSaigaonkar commented on code in PR #3736: URL: https://github.com/apache/hive/pull/3736#discussion_r1071876874 ## itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplWithReadOnlyHook.java: ## @@ -0,0 +1,126 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.hadoop.hive.ql.parse; + +import static org.apache.hadoop.hive.ql.hooks.EnforceReadOnlyDatabaseHook.READONLY; +import static org.apache.hadoop.hive.common.repl.ReplConst.READ_ONLY_HOOK; +import static org.junit.Assert.assertEquals; + +import org.apache.hadoop.hdfs.MiniDFSCluster; +import org.apache.hadoop.hive.conf.HiveConf; +import org.apache.hadoop.hive.metastore.conf.MetastoreConf; +import org.apache.hadoop.hive.metastore.messaging.json.gzip.GzipJSONMessageEncoder; +import org.apache.hadoop.hive.shims.Utils; +import org.junit.After; +import org.junit.BeforeClass; +import org.junit.Test; + +import java.util.HashMap; +import java.util.Map; + +public class TestReplWithReadOnlyHook extends BaseReplicationScenariosAcidTables { + + @BeforeClass + public static void classLevelSetup() throws Exception { +Map overrides = new HashMap<>(); +overrides.put(MetastoreConf.ConfVars.EVENT_MESSAGE_FACTORY.getHiveName(), + GzipJSONMessageEncoder.class.getCanonicalName()); + +conf = new HiveConf(TestReplWithReadOnlyHook.class); +conf.set("hadoop.proxyuser." + Utils.getUGI().getShortUserName() + ".hosts", "*"); + +MiniDFSCluster miniDFSCluster = + new MiniDFSCluster.Builder(conf).numDataNodes(2).format(true).build(); + +Map acidEnableConf = new HashMap() {{ Review Comment: Done Issue Time Tracking --- Worklog Id: (was: 839528) Time Spent: 1h 40m (was: 1.5h) > The very first REPL Load should make the Target Database read-only > -- > > Key: HIVE-26711 > URL: https://issues.apache.org/jira/browse/HIVE-26711 > Project: Hive > Issue Type: Task >Reporter: Shreenidhi >Assignee: Shreenidhi >Priority: Major > Labels: pull-request-available > Time Spent: 1h 40m > Remaining Estimate: 0h > > Use EnforceReadOnly hook to set TARGET database read only during BootStrap > load. > Also ensure backward compatibility. -- This message was sent by Atlassian Jira (v8.20.10#820010)