[jira] [Assigned] (HIVE-26955) Alter table change column data type of a Parquet table throws exception

2023-01-17 Thread Sourabh Badhya (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sourabh Badhya reassigned HIVE-26955:
-

Assignee: Sourabh Badhya

> Alter table change column data type of a Parquet table throws exception
> ---
>
> Key: HIVE-26955
> URL: https://issues.apache.org/jira/browse/HIVE-26955
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Taraka Rama Rao Lethavadla
>Assignee: Sourabh Badhya
>Priority: Major
>
> Steps to reproduce
> {noformat}
> create table test_parquet (id decimal) stored as parquet;
> insert into test_parquet values(238);
> alter table test_parquet change id id string;
> select * from test_parquet;
> Error: java.io.IOException: org.apache.parquet.io.ParquetDecodingException: 
> Can not read value at 1 in block 0 in file 
> hdfs:/namenode:8020/warehouse/tablespace/managed/hive/test_parquet/delta_001_001_/00_0
>  (state=,code=0)
>     at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:624)
>     at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:531)
>     at 
> org.apache.hadoop.hive.ql.exec.FetchTask.executeInner(FetchTask.java:194)
>     ... 55 more
> Caused by: org.apache.parquet.io.ParquetDecodingException: Can not read value 
> at 1 in block 0 in file 
> file:/home/centos/Apache-Hive-Tarak/itests/qtest/target/localfs/warehouse/test_parquet/00_0
>     at 
> org.apache.parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:255)
>     at 
> org.apache.parquet.hadoop.ParquetRecordReader.nextKeyValue(ParquetRecordReader.java:207)
>     at 
> org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.&lt;init&gt;(ParquetRecordReaderWrapper.java:87)
>     at 
> org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:89)
>     at 
> org.apache.hadoop.hive.ql.exec.FetchOperator$FetchInputFormatSplit.getRecordReader(FetchOperator.java:771)
>     at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:335)
>     at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:562)
>     ... 57 more
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.serde2.typeinfo.PrimitiveTypeInfo cannot be cast to 
> org.apache.hadoop.hive.serde2.typeinfo.DecimalTypeInfo
>     at 
> org.apache.hadoop.hive.ql.io.parquet.convert.ETypeConverter$8$5.convert(ETypeConverter.java:669)
>     at 
> org.apache.hadoop.hive.ql.io.parquet.convert.ETypeConverter$8$5.convert(ETypeConverter.java:664)
>     at 
> org.apache.hadoop.hive.ql.io.parquet.convert.ETypeConverter$BinaryConverter.addBinary(ETypeConverter.java:977)
>     at 
> org.apache.parquet.column.impl.ColumnReaderBase$2$6.writeValue(ColumnReaderBase.java:360)
>     at 
> org.apache.parquet.column.impl.ColumnReaderBase.writeCurrentValueToConverter(ColumnReaderBase.java:410)
>     at 
> org.apache.parquet.column.impl.ColumnReaderImpl.writeCurrentValueToConverter(ColumnReaderImpl.java:30)
>     at 
> org.apache.parquet.io.RecordReaderImplementation.read(RecordReaderImplementation.java:406)
>     at 
> org.apache.parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:230)
>     ... 63 more{noformat}
> However, the same works as expected with an ORC table
> {noformat}
> create table test_orc (id decimal) stored as orc;
> insert into test_orc values(238);
> alter table test_orc change id id string;
> select * from test_orc;
> +--------------+
> | test_orc.id  |
> +--------------+
> | 238          |
> +--------------+{noformat}
> as well as with a text table
> {noformat}
> create table test_text (id decimal) stored as textfile;
> insert into test_text values(238);
> alter table test_text change id id string;
> select * from test_text;
> +---------------+
> | test_text.id  |
> +---------------+
> | 238           |
> +---------------+{noformat}
>  
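
The ClassCastException above indicates the Parquet converter still expects the
column's type info to be a DecimalTypeInfo after the column has been altered
to string. A minimal sketch of the kind of instanceof guard involved
(hypothetical illustration only, not the actual ETypeConverter fix):

{code:java}
import org.apache.hadoop.hive.serde2.typeinfo.DecimalTypeInfo;
import org.apache.hadoop.hive.serde2.typeinfo.PrimitiveTypeInfo;
import org.apache.hadoop.hive.serde2.typeinfo.TypeInfoFactory;

public class TypeInfoGuardSketch {
  // The failing cast assumes DecimalTypeInfo; after "alter table ... change
  // id id string" the schema carries a plain PrimitiveTypeInfo ("string").
  static String describe(PrimitiveTypeInfo hiveTypeInfo) {
    if (hiveTypeInfo instanceof DecimalTypeInfo) {
      // Only here is it safe to use decimal-specific accessors.
      DecimalTypeInfo d = (DecimalTypeInfo) hiveTypeInfo;
      return "decimal type: " + d.getTypeName();
    }
    // Fall back to the declared type name instead of casting unconditionally.
    return "non-decimal type: " + hiveTypeInfo.getTypeName();
  }

  public static void main(String[] args) {
    System.out.println(describe(new DecimalTypeInfo(10, 0)));     // decimal type: decimal(10,0)
    System.out.println(describe(TypeInfoFactory.stringTypeInfo)); // non-decimal type: string
  }
}
{code}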



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26598) Fix unsetting of db params for optimized bootstrap when repl dump initiates data copy

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26598?focusedWorklogId=839845&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839845
 ]

ASF GitHub Bot logged work on HIVE-26598:
-

Author: ASF GitHub Bot
Created on: 18/Jan/23 07:37
Start Date: 18/Jan/23 07:37
Worklog Time Spent: 10m 
  Work Description: sonarcloud[bot] commented on PR #3780:
URL: https://github.com/apache/hive/pull/3780#issuecomment-1386612591

   Kudos, SonarCloud Quality Gate passed!
   [Quality Gate passed](https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=3780)

   [0 Bugs](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3780&resolved=false&types=BUG)
   [0 Vulnerabilities](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3780&resolved=false&types=VULNERABILITY)
   [0 Security Hotspots](https://sonarcloud.io/project/security_hotspots?id=apache_hive&pullRequest=3780&resolved=false&types=SECURITY_HOTSPOT)
   [0 Code Smells](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3780&resolved=false&types=CODE_SMELL)

   No Coverage information
   No Duplication information




Issue Time Tracking
---

Worklog Id: (was: 839845)
Time Spent: 40m  (was: 0.5h)

> Fix unsetting of db params for optimized bootstrap when repl dump initiates 
> data copy
> -
>
> Key: HIVE-26598
> URL: https://issues.apache.org/jira/browse/HIVE-26598
> Project: Hive
>  Issue Type: Bug
>Reporter: Teddy Choi
>Assignee: Rakshith C
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> When hive.repl.run.data.copy.tasks.on.target is set to false, the repl dump
> task initiates the copy task from the source cluster to the staging directory.
> In the current code flow, the repl dump task dumps the metadata and then
> creates another repl dump task with datacopyIterators initialized.
> When the second dump cycle executes, it directly begins the data copy tasks.
> Because of this we never enter the second reverse-dump flow, and
> unsetDbPropertiesForOptimisedBootstrap is never set to true again.
> This results in db params (repl.target.for, repl.background.threads, etc.)
> not being unset.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HIVE-26959) CREATE TABLE with external.table.purge=false is ignored

2023-01-17 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17678103#comment-17678103
 ] 

Ayush Saxena commented on HIVE-26959:
-

You should have used create external table rather than just create table.
Without the external keyword it is treated as an attempt to create a managed
table, which gets translated to an external table with purge set to true, and
that takes precedence over the value specified in the table properties.
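
For example, the behaviour described above can be avoided by making the table
explicitly external (a sketch reusing the reporter's example table):

{noformat}
create external table test_parq_hive (i int)
stored as parquet
tblproperties ('external.table.purge'='false');
{noformat}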

> CREATE TABLE with external.table.purge=false is ignored
> ---
>
> Key: HIVE-26959
> URL: https://issues.apache.org/jira/browse/HIVE-26959
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0-alpha-2
>Reporter: Li Penglin
>Priority: Major
>
> We set the default external.table.purge=true in 
> https://issues.apache.org/jira/browse/HIVE-26064, but this property is still 
> true when I set it to false.
>  
> {code:java}
> select version();
> +----------------------------------------------------+
> |                        _c0                         |
> +----------------------------------------------------+
> | 4.0.0-alpha-2 r36f5d91acb0fac00a5d46049bd45b744fe9aaab6 |
> +----------------------------------------------------+
> 0: jdbc:hive2://localhost:11050>  create table test_parq_hive (i int)         
>                                 
> . . . . . . . . . . . . . . . .>  stored as parquet                           
>                                                  
> . . . . . . . . . . . . . . . .>  tblproperties 
> ('external.table.purge'='false');                             
> INFO  : Compiling 
> command(queryId=root_20230118111342_c60c0f5f-3e2f-45af-a7ab-13f099b7ead9): 
> create table test_parq_hive (i int)
> stored as parquet                                                             
>                                                  
> tblproperties ('external.table.purge'='false')
> INFO  : Semantic Analysis Completed (retrial = false)
> 0: jdbc:hive2://localhost:11050> describe formatted test_parq_hive; 
> |                               | bucketing_version      | 2           |
> |                               | external.table.purge   | TRUE        |
> |                               | transient_lastDdlTime  | 1674011622  |
> {code}
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-22977) Merge delta files instead of running a query in major/minor compaction

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22977?focusedWorklogId=839844&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839844
 ]

ASF GitHub Bot logged work on HIVE-22977:
-

Author: ASF GitHub Bot
Created on: 18/Jan/23 07:27
Start Date: 18/Jan/23 07:27
Worklog Time Spent: 10m 
  Work Description: sonarcloud[bot] commented on PR #3801:
URL: https://github.com/apache/hive/pull/3801#issuecomment-1386603262

   Kudos, SonarCloud Quality Gate passed!
   [Quality Gate passed](https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=3801)

   [0 Bugs](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3801&resolved=false&types=BUG)
   [0 Vulnerabilities](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3801&resolved=false&types=VULNERABILITY)
   [0 Security Hotspots](https://sonarcloud.io/project/security_hotspots?id=apache_hive&pullRequest=3801&resolved=false&types=SECURITY_HOTSPOT)
   [2 Code Smells](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3801&resolved=false&types=CODE_SMELL)

   No Coverage information
   No Duplication information




Issue Time Tracking
---

Worklog Id: (was: 839844)
Time Spent: 4h 50m  (was: 4h 40m)

> Merge delta files instead of running a query in major/minor compaction
> --
>
> Key: HIVE-22977
> URL: https://issues.apache.org/jira/browse/HIVE-22977
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Pintér
>Assignee: Sourabh Badhya
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22977.01.patch, HIVE-22977.02.patch
>
>  Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> [Compaction Optimization]
> We should analyse the possibility of moving delta files instead of running a
> major/minor compaction query.
> Please consider the following use cases:
>  - Full ACID table, but only insert queries were run, so no delete delta
> directories were created. Is it possible to merge the delta directory
> contents without running a compaction query?
>  - Full ACID table, with queries initiated through the streaming API. If
> there are no aborted transactions during streaming, is it possible to merge
> the delta directory contents without running a compaction query?
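
For reference, the compaction discussed above can be requested manually with
standard Hive ACID DDL such as the following (table and partition names are
illustrative):

{noformat}
alter table acid_tbl compact 'major';
alter table acid_tbl partition (ds='2023-01-17') compact 'minor';
{noformat}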



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-26962) Expose resume/reset ready state through replication metrics when first cycle of resume/reset completes

2023-01-17 Thread Shreenidhi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shreenidhi updated HIVE-26962:
--
Description: 
As the resume/reset workflow also follows optimised bootstrap, two cycles are
needed to mark this flow as complete.

1. The 1st cycle is triggered by the orchestrator when the resume/reset action
is initiated.
2. To initiate another cycle, the orchestrator needs to know whether the first
cycle completed. For this we need a mechanism in Hive that puts a
RESUME/RESET_READY state in the replication metrics once the first cycle of
RESUME/RESET completes.
 * Once the orchestrator sees the RESET_READY state, it triggers another cycle
and does the necessary work to complete the RESET workflow.

  was:
As resume/reset workflow also follows optimised bootstrap, so here we have 2 
cycles to mark this flow as complete. 

1. 1st cycle will be triggered by orchestrator just when resume/reset action 
initiated. 
2. now to initiate another cycle orchestrator needs to know if the first cycle 
got complete. To do this we need a mechanism in hive where it puts 
RESUME/RESET_READY state in replication metrics once the first cycle of 
RESUME/RESET completes. 

* Once orchestrator sees the RESET_READY state, it will trigger the another 
cycle and does necessary work which needs to be done to complete RESET 
workflow. 


> Expose resume/reset ready state through replication metrics when first cycle 
> of resume/reset completes
> --
>
> Key: HIVE-26962
> URL: https://issues.apache.org/jira/browse/HIVE-26962
> Project: Hive
>  Issue Type: Task
>Reporter: Shreenidhi
>Assignee: Shreenidhi
>Priority: Major
>
> As the resume/reset workflow also follows optimised bootstrap, two cycles
> are needed to mark this flow as complete.
> 1. The 1st cycle is triggered by the orchestrator when the resume/reset
> action is initiated.
> 2. To initiate another cycle, the orchestrator needs to know whether the
> first cycle completed. For this we need a mechanism in Hive that puts a
> RESUME/RESET_READY state in the replication metrics once the first cycle of
> RESUME/RESET completes.
>  * Once the orchestrator sees the RESET_READY state, it triggers another
> cycle and does the necessary work to complete the RESET workflow.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HIVE-26962) Expose resume/reset ready state through replication metrics when first cycle of resume/reset completes

2023-01-17 Thread Shreenidhi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shreenidhi reassigned HIVE-26962:
-

Assignee: Shreenidhi

> Expose resume/reset ready state through replication metrics when first cycle 
> of resume/reset completes
> --
>
> Key: HIVE-26962
> URL: https://issues.apache.org/jira/browse/HIVE-26962
> Project: Hive
>  Issue Type: Task
>Reporter: Shreenidhi
>Assignee: Shreenidhi
>Priority: Major
>
> As the resume/reset workflow also follows optimised bootstrap, two cycles
> are needed to mark this flow as complete.
> 1. The 1st cycle is triggered by the orchestrator when the resume/reset
> action is initiated.
> 2. To initiate another cycle, the orchestrator needs to know whether the
> first cycle completed. For this we need a mechanism in Hive that puts a
> RESUME/RESET_READY state in the replication metrics once the first cycle of
> RESUME/RESET completes.
> * Once the orchestrator sees the RESET_READY state, it triggers another
> cycle and does the necessary work to complete the RESET workflow.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-26959) CREATE TABLE with external.table.purge=false is ignored

2023-01-17 Thread Li Penglin (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Penglin updated HIVE-26959:
--
Description: 
We set the default external.table.purge=true in 
https://issues.apache.org/jira/browse/HIVE-26064, but this property is still 
true when I set it to false.

 
{code:java}
select version();
+----------------------------------------------------+
|                        _c0                         |
+----------------------------------------------------+
| 4.0.0-alpha-2 r36f5d91acb0fac00a5d46049bd45b744fe9aaab6 |
+----------------------------------------------------+


0: jdbc:hive2://localhost:11050>  create table test_parq_hive (i int)           
                              
. . . . . . . . . . . . . . . .>  stored as parquet                             
                                               
. . . . . . . . . . . . . . . .>  tblproperties 
('external.table.purge'='false');                             
INFO  : Compiling 
command(queryId=root_20230118111342_c60c0f5f-3e2f-45af-a7ab-13f099b7ead9): 
create table test_parq_hive (i int)
stored as parquet                                                               
                                               
tblproperties ('external.table.purge'='false')
INFO  : Semantic Analysis Completed (retrial = false)


0: jdbc:hive2://localhost:11050> describe formatted test_parq_hive; 
|                               | bucketing_version      | 2           |
|                               | external.table.purge   | TRUE        |
|                               | transient_lastDdlTime  | 1674011622  |


{code}
 

 

  was:
We set the default external.table.purge=true in 
https://issues.apache.org/jira/browse/HIVE-26064, but this property is still 
true when I set it to false.

 
{code:java}
0: jdbc:hive2://localhost:11050>  create table test_parq_hive (i int)           
                              
. . . . . . . . . . . . . . . .>  stored as parquet                             
                                               
. . . . . . . . . . . . . . . .>  tblproperties 
('external.table.purge'='false');                             
INFO  : Compiling 
command(queryId=root_20230118111342_c60c0f5f-3e2f-45af-a7ab-13f099b7ead9): 
create table test_parq_hive (i int)
stored as parquet                                                               
                                               
tblproperties ('external.table.purge'='false')
INFO  : Semantic Analysis Completed (retrial = false)


0: jdbc:hive2://localhost:11050> describe formatted test_parq_hive; 
|                               | bucketing_version      | 2           |
|                               | external.table.purge   | TRUE        |
|                               | transient_lastDdlTime  | 1674011622  |


{code}
 

 


> CREATE TABLE with external.table.purge=false is ignored
> ---
>
> Key: HIVE-26959
> URL: https://issues.apache.org/jira/browse/HIVE-26959
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0-alpha-2
>Reporter: Li Penglin
>Priority: Major
>
> We set the default external.table.purge=true in 
> https://issues.apache.org/jira/browse/HIVE-26064, but this property is still 
> true when I set it to false.
>  
> {code:java}
> select version();
> +----------------------------------------------------+
> |                        _c0                         |
> +----------------------------------------------------+
> | 4.0.0-alpha-2 r36f5d91acb0fac00a5d46049bd45b744fe9aaab6 |
> +----------------------------------------------------+
> 0: jdbc:hive2://localhost:11050>  create table test_parq_hive (i int)         
>                                 
> . . . . . . . . . . . . . . . .>  stored as parquet                           
>                                                  
> . . . . . . . . . . . . . . . .>  tblproperties 
> ('external.table.purge'='false');                             
> INFO  : Compiling 
> command(queryId=root_20230118111342_c60c0f5f-3e2f-45af-a7ab-13f099b7ead9): 
> create table test_parq_hive (i int)
> stored as parquet                                                             
>                                                  
> tblproperties ('external.table.purge'='false')
> INFO  : Semantic Analysis Completed (retrial = false)
> 0: jdbc:hive2://localhost:11050> 

[jira] [Updated] (HIVE-26959) CREATE TABLE with external.table.purge=false is ignored

2023-01-17 Thread Li Penglin (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Penglin updated HIVE-26959:
--
Affects Version/s: 4.0.0-alpha-2

> CREATE TABLE with external.table.purge=false is ignored
> ---
>
> Key: HIVE-26959
> URL: https://issues.apache.org/jira/browse/HIVE-26959
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0-alpha-2
>Reporter: Li Penglin
>Priority: Major
>
> We set the default external.table.purge=true in 
> https://issues.apache.org/jira/browse/HIVE-26064, but this property is still 
> true when I set it to false.
>  
> {code:java}
> 0: jdbc:hive2://localhost:11050>  create table test_parq_hive (i int)         
>                                 
> . . . . . . . . . . . . . . . .>  stored as parquet                           
>                                                  
> . . . . . . . . . . . . . . . .>  tblproperties 
> ('external.table.purge'='false');                             
> INFO  : Compiling 
> command(queryId=root_20230118111342_c60c0f5f-3e2f-45af-a7ab-13f099b7ead9): 
> create table test_parq_hive (i int)
> stored as parquet                                                             
>                                                  
> tblproperties ('external.table.purge'='false')
> INFO  : Semantic Analysis Completed (retrial = false)
> 0: jdbc:hive2://localhost:11050> describe formatted test_parq_hive; 
> |                               | bucketing_version      | 2           |
> |                               | external.table.purge   | TRUE        |
> |                               | transient_lastDdlTime  | 1674011622  |
> {code}
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HIVE-26961) Fix improper replication metric count when hive.repl.filter.transactions is set to true.

2023-01-17 Thread Rakshith C (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakshith C reassigned HIVE-26961:
-


> Fix improper replication metric count when hive.repl.filter.transactions is 
> set to true.
> 
>
> Key: HIVE-26961
> URL: https://issues.apache.org/jira/browse/HIVE-26961
> Project: Hive
>  Issue Type: Bug
>Reporter: Rakshith C
>Assignee: Rakshith C
>Priority: Major
>
> Scenario:
> When hive.repl.filter.transactions = true, repl dump filters out read-only
> transactions to improve throughput.
> Metrics logged to HMS are incorrect because there is a mismatch between the
> count of events read from the notification logs and the count of events
> dumped to the staging directory.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-26943) Fix NPE during Optimised Bootstrap when db is dropped

2023-01-17 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HIVE-26943:

Parent: HIVE-25699
Issue Type: Sub-task  (was: Task)

> Fix NPE during Optimised Bootstrap when db is dropped
> -
>
> Key: HIVE-26943
> URL: https://issues.apache.org/jira/browse/HIVE-26943
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Shreenidhi
>Assignee: Shreenidhi
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Consider the steps:
> 1. Current replication is from A (source) -> B (target).
> 2. Failover is complete, so now A (target) <- B (source).
> 3. Suppose the db at A is dropped before reverse replication.
> 4. Now, when reverse replication triggers optimised bootstrap, it will throw
> an NPE.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HIVE-26960) Optimized bootstrap does not drop newly added tables at source.

2023-01-17 Thread Rakshith C (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakshith C reassigned HIVE-26960:
-


> Optimized bootstrap does not drop newly added tables at source.
> ---
>
> Key: HIVE-26960
> URL: https://issues.apache.org/jira/browse/HIVE-26960
> Project: Hive
>  Issue Type: Bug
>Reporter: Rakshith C
>Assignee: Rakshith C
>Priority: Major
>
> Scenario:
> Replication is set up from DR to PROD after failover from PROD to DR; no
> existing tables are modified at PROD, but a new table is added at PROD.
> Observations:
>  * The _bootstrap directory won't be created during the second cycle of
> optimized bootstrap because no existing tables were modified.
>  * Because of this, the list of tables to drop at PROD is never initialized.
>  * This leads to the new table created at PROD not being dropped.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26943) Fix NPE during Optimised Bootstrap when db is dropped

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26943?focusedWorklogId=839838&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839838
 ]

ASF GitHub Bot logged work on HIVE-26943:
-

Author: ASF GitHub Bot
Created on: 18/Jan/23 06:53
Start Date: 18/Jan/23 06:53
Worklog Time Spent: 10m 
  Work Description: shreenidhiSaigaonkar commented on code in PR #3953:
URL: https://github.com/apache/hive/pull/3953#discussion_r1073152555


##
ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplLoadTask.java:
##
@@ -721,6 +721,10 @@ private int executeIncrementalLoad(long loadStartTime) 
throws Exception {
 Database targetDb = getHive().getDatabase(work.dbNameToLoadIn);
 Map<String, String> props = new HashMap<>();
 
+if(targetDb == null) {

Review Comment:
   Done.





Issue Time Tracking
---

Worklog Id: (was: 839838)
Time Spent: 1h  (was: 50m)

> Fix NPE during Optimised Bootstrap when db is dropped
> -
>
> Key: HIVE-26943
> URL: https://issues.apache.org/jira/browse/HIVE-26943
> Project: Hive
>  Issue Type: Task
>Reporter: Shreenidhi
>Assignee: Shreenidhi
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Consider the steps:
> 1. Current replication is from A (source) -> B (target).
> 2. Failover is complete, so now A (target) <- B (source).
> 3. Suppose the db at A is dropped before reverse replication.
> 4. Now, when reverse replication triggers optimised bootstrap, it will throw
> an NPE.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26943) Fix NPE during Optimised Bootstrap when db is dropped

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26943?focusedWorklogId=839837&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839837
 ]

ASF GitHub Bot logged work on HIVE-26943:
-

Author: ASF GitHub Bot
Created on: 18/Jan/23 06:47
Start Date: 18/Jan/23 06:47
Worklog Time Spent: 10m 
  Work Description: pudidic commented on code in PR #3953:
URL: https://github.com/apache/hive/pull/3953#discussion_r1073149035


##
ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplLoadTask.java:
##
@@ -721,6 +721,10 @@ private int executeIncrementalLoad(long loadStartTime) 
throws Exception {
 Database targetDb = getHive().getDatabase(work.dbNameToLoadIn);
 Map<String, String> props = new HashMap<>();
 
+if(targetDb == null) {

Review Comment:
   Please follow the coding convention; have a whitespace after if.





Issue Time Tracking
---

Worklog Id: (was: 839837)
Time Spent: 50m  (was: 40m)

> Fix NPE during Optimised Bootstrap when db is dropped
> -
>
> Key: HIVE-26943
> URL: https://issues.apache.org/jira/browse/HIVE-26943
> Project: Hive
>  Issue Type: Task
>Reporter: Shreenidhi
>Assignee: Shreenidhi
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Consider the steps:
> 1. Current replication is from A (source) -> B (target).
> 2. Failover is complete, so now A (target) <- B (source).
> 3. Suppose the db at A is dropped before reverse replication.
> 4. Now, when reverse replication triggers optimised bootstrap, it will throw
> an NPE.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26952) set the value of metastore.storage.schema.reader.impl
 to org.apache.hadoop.hive.metastore.SerDeStorageSchemaReader as default

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26952?focusedWorklogId=839836&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839836
 ]

ASF GitHub Bot logged work on HIVE-26952:
-

Author: ASF GitHub Bot
Created on: 18/Jan/23 06:46
Start Date: 18/Jan/23 06:46
Worklog Time Spent: 10m 
  Work Description: SourabhBadhya commented on code in PR #3959:
URL: https://github.com/apache/hive/pull/3959#discussion_r1073148504


##
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java:
##
@@ -67,6 +67,9 @@ public class MetastoreConf {
   static final String DEFAULT_STORAGE_SCHEMA_READER_CLASS =
   "org.apache.hadoop.hive.metastore.DefaultStorageSchemaReader";
   @VisibleForTesting
+  static final String SERDE_STORAGE_SCHEMA_READER_CLASS =
+  "org.apache.hadoop.hive.metastore.SerDeStorageSchemaReader";

Review Comment:
   Since you are introducing this, could you please add assertion case here - 
Check the below -
   
https://github.com/apache/hive/blob/c92a478e514a28a53009fe5fbf08ce6fa35b58b9/standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/conf/TestMetastoreConf.java#L482





Issue Time Tracking
---

Worklog Id: (was: 839836)
Time Spent: 0.5h  (was: 20m)

> set the value of metastore.storage.schema.reader.impl to
> org.apache.hadoop.hive.metastore.SerDeStorageSchemaReader as default
> --
>
> Key: HIVE-26952
> URL: https://issues.apache.org/jira/browse/HIVE-26952
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Reporter: Taraka Rama Rao Lethavadla
>Assignee: Taraka Rama Rao Lethavadla
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> With the default value of
> {code:java}
> DefaultStorageSchemaReader.class.getName(){code}
> for the Metastore config *metastore.storage.schema.reader.impl*, the
> exception below is thrown when trying to read an Avro schema:
> {noformat}
> Caused by: org.apache.hive.service.cli.HiveSQLException: MetaException 
> (message:java.lang.UnsupportedOperationException: Storage schema reading not 
> supported)
>     at 
> org.apache.hive.service.cli.operation.GetColumnsOperation.runInternal(GetColumnsOperation.java:213)
>     at org.apache.hive.service.cli.operation.Operation.run(Operation.java:247)
>     at 
> org.apache.hive.service.cli.session.HiveSessionImpl.getColumns(HiveSessionImpl.java:729)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at 
> org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
>     at 
> org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
>     at 
> org.apache.hive.service.cli.session.HiveSessionProxy.run(HiveSessionProxy.java:63)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
>     at 
> org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
>     at com.sun.proxy..getColumns(Unknown Source)
>     at 
> org.apache.hive.service.cli.CLIService.getColumns(CLIService.java:390){noformat}
> Setting the above config to
> *org.apache.hadoop.hive.metastore.SerDeStorageSchemaReader* resolves the
> issue.
> Proposing to make this value the default in the code base, so that in
> upcoming versions it does not have to be set manually.
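
The workaround described above corresponds to the following metastore
configuration entry (standard hive-site.xml / metastore-site.xml property
syntax):

{noformat}
<property>
  <name>metastore.storage.schema.reader.impl</name>
  <value>org.apache.hadoop.hive.metastore.SerDeStorageSchemaReader</value>
</property>
{noformat}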



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26956) Improve find_in_set function

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26956?focusedWorklogId=839830&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839830
 ]

ASF GitHub Bot logged work on HIVE-26956:
-

Author: ASF GitHub Bot
Created on: 18/Jan/23 06:03
Start Date: 18/Jan/23 06:03
Worklog Time Spent: 10m 
  Work Description: sonarcloud[bot] commented on PR #3961:
URL: https://github.com/apache/hive/pull/3961#issuecomment-1386535042

   Kudos, SonarCloud Quality Gate passed!
   [Quality Gate passed](https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=3961)

   [0 Bugs](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3961&resolved=false&types=BUG)
   [0 Vulnerabilities](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3961&resolved=false&types=VULNERABILITY)
   [0 Security Hotspots](https://sonarcloud.io/project/security_hotspots?id=apache_hive&pullRequest=3961&resolved=false&types=SECURITY_HOTSPOT)
   [0 Code Smells](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3961&resolved=false&types=CODE_SMELL)

   No Coverage information
   No Duplication information




Issue Time Tracking
---

Worklog Id: (was: 839830)
Time Spent: 50m  (was: 40m)

> Improve find_in_set function
> ---
>
> Key: HIVE-26956
> URL: https://issues.apache.org/jira/browse/HIVE-26956
> Project: Hive
>  Issue Type: Improvement
>Reporter: Bingye Chen
>Assignee: Bingye Chen
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Improve find_in_set function



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work started] (HIVE-26601) Fix NPE encountered in second load cycle of optimised bootstrap

2023-01-17 Thread Vinit Patni (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-26601 started by Vinit Patni.
--
> Fix NPE encountered in second load cycle of optimised bootstrap 
> 
>
> Key: HIVE-26601
> URL: https://issues.apache.org/jira/browse/HIVE-26601
> Project: Hive
>  Issue Type: Bug
>Reporter: Teddy Choi
>Assignee: Vinit Patni
>Priority: Blocker
>
> After the reverse replication policy is created, once failover from the
> Primary to the DR cluster is complete and DR takes over, the first dump and
> load cycle of optimised bootstrap completes successfully. The second dump
> cycle on DR also completes; it does a selective bootstrap of the tables it
> read from the table_diff directory. However, we observed an issue with the
> second load cycle on the Primary cluster side, which fails with the
> following exception logs and needs to be fixed.
> {code:java}
> [Scheduled Query Executor(schedule:repl_vinreverse, execution_id:421)]: 
> Exception while logging metrics 
> java.lang.NullPointerException: null
>   at 
> org.apache.hadoop.hive.ql.parse.repl.metric.ReplicationMetricCollector.reportStageProgress(ReplicationMetricCollector.java:192)
>  ~[hive-exec-3.1.3000.7.1.8.0-801.jar:3.1.3000.7.1.8.0-801]
>   at 
> org.apache.hadoop.hive.ql.exec.repl.ReplStateLogWork.replStateLog(ReplStateLogWork.java:145)
>  ~[hive-exec-3.1.3000.7.1.8.0-801.jar:3.1.3000.7.1.8.0-801]
>   at 
> org.apache.hadoop.hive.ql.exec.repl.ReplStateLogTask.execute(ReplStateLogTask.java:39)
>  [hive-exec-3.1.3000.7.1.8.0-801.jar:3.1.3000.7.1.8.0-801]
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213) 
> [hive-exec-3.1.3000.7.1.8.0-801.jar:3.1.3000.7.1.8.0-801]
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105) 
> [hive-exec-3.1.3000.7.1.8.0-801.jar:3.1.3000.7.1.8.0-801]
>   at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:357) 
> [hive-exec-3.1.3000.7.1.8.0-801.jar:3.1.3000.7.1.8.0-801]
>   at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:330) 
> [hive-exec-3.1.3000.7.1.8.0-801.jar:3.1.3000.7.1.8.0-801]
>   at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:246) 
> [hive-exec-3.1.3000.7.1.8.0-801.jar:3.1.3000.7.1.8.0-801]
>   at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:109) 
> [hive-exec-3.1.3000.7.1.8.0-801.jar:3.1.3000.7.1.8.0-801]
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:749) 
> [hive-exec-3.1.3000.7.1.8.0-801.jar:3.1.3000.7.1.8.0-801]
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:504) 
> [hive-exec-3.1.3000.7.1.8.0-801.jar:3.1.3000.7.1.8.0-801]
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:498) 
> [hive-exec-3.1.3000.7.1.8.0-801.jar:3.1.3000.7.1.8.0-801]
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:166) 
> [hive-exec-3.1.3000.7.1.8.0-801.jar:3.1.3000.7.1.8.0-801]
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:232) 
> [hive-exec-3.1.3000.7.1.8.0-801.jar:3.1.3000.7.1.8.0-801]
>   at 
> org.apache.hadoop.hive.ql.scheduled.ScheduledQueryExecutionService$ScheduledQueryExecutor.processQuery(ScheduledQueryExecutionService.java:240)
>  [hive-exec-3.1.3000.7.1.8.0-801.jar:3.1.3000.7.1.8.0-801]
>   at 
> org.apache.hadoop.hive.ql.scheduled.ScheduledQueryExecutionService$ScheduledQueryExecutor.run(ScheduledQueryExecutionService.java:193)
>  [hive-exec-3.1.3000.7.1.8.0-801.jar:3.1.3000.7.1.8.0-801]
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> [?:1.8.0_232]
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> [?:1.8.0_232]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  [?:1.8.0_232]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  [?:1.8.0_232]
>   at java.lang.Thread.run(Thread.java:748) [?:1.8.0_232]{code}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-26599) Fix NPE encountered in second dump cycle of optimised bootstrap

2023-01-17 Thread Vinit Patni (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinit Patni updated HIVE-26599:
---
Status: Patch Available  (was: In Progress)

> Fix NPE encountered in second dump cycle of optimised bootstrap
> ---
>
> Key: HIVE-26599
> URL: https://issues.apache.org/jira/browse/HIVE-26599
> Project: Hive
>  Issue Type: Bug
>Reporter: Teddy Choi
>Assignee: Vinit Patni
>Priority: Blocker
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> After the reverse replication policy is created, once failover from the
> Primary to the DR cluster is complete and DR takes over, the first dump and
> load cycle of optimised bootstrap completes successfully. However, we
> encounter a NullPointerException in the second dump cycle, which halts the
> reverse replication and is a major blocker for testing the complete
> replication cycle.
> {code:java}
> Scheduled Query Executor(schedule:repl_reverse, execution_id:14)]: FAILED: 
> Execution Error, return code -101 from 
> org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask. 
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.parse.repl.metric.ReplicationMetricCollector.reportStageProgress(ReplicationMetricCollector.java:192)
> at 
> org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.dumpTable(ReplDumpTask.java:1458)
> at 
> org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.incrementalDump(ReplDumpTask.java:961)
> at 
> org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.execute(ReplDumpTask.java:290)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105)
> at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:357)
> at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:330)
> at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:246)
> at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:109)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:749)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:504)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:498)
> at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:166)
> at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:232){code}
> After doing RCA, we figured out that in the second dump cycle on the DR
> cluster, when the StageStart method is invoked, the metric corresponding to
> Tables is not registered (it should be, since we do a selective bootstrap of
> tables for optimised bootstrap along with the incremental dump). This causes
> an NPE later, when the code tries to update the progress for this metric
> after the table bootstrap completes.
> The fix is to register the Tables metric before updating the progress.
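
A rough sketch of the ordering the fix implies, based on the
ReplicationMetricCollector calls visible in the stack trace; the exact
signatures, metric names, and surrounding fields here are assumptions, not the
actual patch:

{code:java}
// Assumed sketch (not the actual patch): inside the dump task, register the
// "TABLES" metric for the stage before any progress is reported against it,
// so reportStageProgress does not dereference a missing metric (the NPE at
// ReplicationMetricCollector.reportStageProgress:192).
java.util.Map<String, Long> metricMap = new java.util.HashMap<>();
metricMap.put("TABLES", (long) tablesForBootstrap.size()); // hypothetical field
metricCollector.reportStageStart("REPL_DUMP", metricMap);

// ... later, once each table has been bootstrapped:
metricCollector.reportStageProgress("REPL_DUMP", "TABLES", 1);
{code}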



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work started] (HIVE-26599) Fix NPE encountered in second dump cycle of optimised bootstrap

2023-01-17 Thread Vinit Patni (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-26599 started by Vinit Patni.
--
> Fix NPE encountered in second dump cycle of optimised bootstrap
> ---
>
> Key: HIVE-26599
> URL: https://issues.apache.org/jira/browse/HIVE-26599
> Project: Hive
>  Issue Type: Bug
>Reporter: Teddy Choi
>Assignee: Vinit Patni
>Priority: Blocker
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> After the reverse replication policy is created, once failover from the
> Primary to the DR cluster is complete and DR takes over, the first dump and
> load cycle of optimised bootstrap completes successfully. However, we
> encounter a NullPointerException in the second dump cycle, which halts the
> reverse replication and is a major blocker for testing the complete
> replication cycle.
> {code:java}
> Scheduled Query Executor(schedule:repl_reverse, execution_id:14)]: FAILED: 
> Execution Error, return code -101 from 
> org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask. 
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.parse.repl.metric.ReplicationMetricCollector.reportStageProgress(ReplicationMetricCollector.java:192)
> at 
> org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.dumpTable(ReplDumpTask.java:1458)
> at 
> org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.incrementalDump(ReplDumpTask.java:961)
> at 
> org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.execute(ReplDumpTask.java:290)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105)
> at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:357)
> at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:330)
> at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:246)
> at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:109)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:749)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:504)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:498)
> at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:166)
> at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:232){code}
> After doing RCA, we figured out that in the second dump cycle on the DR
> cluster, when the StageStart method is invoked, the metric corresponding to
> Tables is not registered (it should be, since we do a selective bootstrap of
> tables for optimised bootstrap along with the incremental dump). This causes
> an NPE later, when the code tries to update the progress for this metric
> after the table bootstrap completes.
> The fix is to register the Tables metric before updating the progress.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26599) Fix NPE encountered in second dump cycle of optimised bootstrap

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26599?focusedWorklogId=839826&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839826
 ]

ASF GitHub Bot logged work on HIVE-26599:
-

Author: ASF GitHub Bot
Created on: 18/Jan/23 05:32
Start Date: 18/Jan/23 05:32
Worklog Time Spent: 10m 
  Work Description: vinitpatni opened a new pull request, #3963:
URL: https://github.com/apache/hive/pull/3963

   …d bootstrap
   
   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   




Issue Time Tracking
---

Worklog Id: (was: 839826)
Remaining Estimate: 0h
Time Spent: 10m

> Fix NPE encountered in second dump cycle of optimised bootstrap
> ---
>
> Key: HIVE-26599
> URL: https://issues.apache.org/jira/browse/HIVE-26599
> Project: Hive
>  Issue Type: Bug
>Reporter: Teddy Choi
>Assignee: Vinit Patni
>Priority: Blocker
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> After the reverse replication policy is created, once failover from the
> Primary to the DR cluster is complete and DR takes over, the first dump and
> load cycle of optimised bootstrap completes successfully. However, we
> encounter a NullPointerException in the second dump cycle, which halts the
> reverse replication and is a major blocker for testing the complete
> replication cycle.
> {code:java}
> Scheduled Query Executor(schedule:repl_reverse, execution_id:14)]: FAILED: 
> Execution Error, return code -101 from 
> org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask. 
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.parse.repl.metric.ReplicationMetricCollector.reportStageProgress(ReplicationMetricCollector.java:192)
> at 
> org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.dumpTable(ReplDumpTask.java:1458)
> at 
> org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.incrementalDump(ReplDumpTask.java:961)
> at 
> org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.execute(ReplDumpTask.java:290)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105)
> at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:357)
> at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:330)
> at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:246)
> at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:109)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:749)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:504)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:498)
> at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:166)
> at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:232){code}
> After doing RCA, we figured out that in the second dump cycle on the DR
> cluster, when the StageStart method is invoked, the metric corresponding to
> Tables is not registered (it should be, since we do a selective bootstrap of
> tables for optimised bootstrap along with the incremental dump). This causes
> an NPE later, when the code tries to update the progress for this metric
> after the table bootstrap completes.
> The fix is to register the Tables metric before updating the progress.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-26599) Fix NPE encountered in second dump cycle of optimised bootstrap

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-26599:
--
Labels: pull-request-available  (was: )

> Fix NPE encountered in second dump cycle of optimised bootstrap
> ---
>
> Key: HIVE-26599
> URL: https://issues.apache.org/jira/browse/HIVE-26599
> Project: Hive
>  Issue Type: Bug
>Reporter: Teddy Choi
>Assignee: Vinit Patni
>Priority: Blocker
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> After the reverse replication policy is created, once failover from the
> Primary to the DR cluster is complete and DR takes over, the first dump and
> load cycle of optimised bootstrap completes successfully. However, we
> encounter a NullPointerException in the second dump cycle, which halts the
> reverse replication and is a major blocker for testing the complete
> replication cycle.
> {code:java}
> Scheduled Query Executor(schedule:repl_reverse, execution_id:14)]: FAILED: 
> Execution Error, return code -101 from 
> org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask. 
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.parse.repl.metric.ReplicationMetricCollector.reportStageProgress(ReplicationMetricCollector.java:192)
> at 
> org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.dumpTable(ReplDumpTask.java:1458)
> at 
> org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.incrementalDump(ReplDumpTask.java:961)
> at 
> org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.execute(ReplDumpTask.java:290)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105)
> at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:357)
> at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:330)
> at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:246)
> at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:109)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:749)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:504)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:498)
> at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:166)
> at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:232){code}
> After doing RCA, we figured out that in the second dump cycle on the DR 
> cluster, when the StageStart method is invoked, the metric corresponding to 
> Tables is not registered (it should be, since we do a selective bootstrap of 
> tables for optimised bootstrap along with the incremental dump). This causes 
> an NPE later on, when the progress for this metric is updated after the 
> bootstrap of a table completes. 
> The fix is to register the Tables metric before updating the progress.
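> A minimal sketch of the shape of the fix (stage/metric names and the 
> table-list variable are illustrative, not the actual patch; StageStart and 
> reportStageProgress are taken from the description and stack trace above):
> {code:java}
> // Register the Tables metric for the stage up front. Without this,
> // reportStageProgress looks up a metric that was never created and
> // throws the NullPointerException seen above.
> Map<String, Long> metricMap = new HashMap<>();
> metricMap.put("TABLES", (long) tablesForBootstrap.size());  // illustrative
> metricCollector.reportStageStart("REPL_DUMP", metricMap);
>
> // Later, when each table bootstrap completes, the existing progress
> // update now finds a registered metric:
> metricCollector.reportStageProgress("REPL_DUMP", "TABLES", 1);
> {code}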



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HIVE-26597) Fix unsetting of db prop repl.target.for in ReplicationSemanticAnalyzer

2023-01-17 Thread Rakshith C (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakshith C resolved HIVE-26597.
---
Resolution: Fixed

> Fix unsetting of db prop repl.target.for in ReplicationSemanticAnalyzer
> ---
>
> Key: HIVE-26597
> URL: https://issues.apache.org/jira/browse/HIVE-26597
> Project: Hive
>  Issue Type: Bug
>Reporter: Teddy Choi
>Assignee: Rakshith C
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> When the repl policy is set from A -> B:
>  * *repl.target.for* is set on B.
> When failover is initiated:
>  * *repl.failover.endpoint* = *'TARGET'* is set on B.
>  
> Now, when the reverse policy is set up from *A <- B*, there is a check in 
> [ReplicationSemanticAnalyzer#initReplDump|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/parse/ReplicationSemanticAnalyzer.java#L196]
>  which checks for the existence of these two properties and, if they are 
> both set, unsets the *repl.target.for* property.
> Because of this, optimised bootstrap is not triggered, since it checks for 
> the existence of the *repl.target.for* property during repl dump on the 
> target 
> [HERE|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/repl/OptimisedBootstrapUtils.java#L93].
>  
> Fix: remove the code that unsets *repl.target.for* in 
> ReplicationSemanticAnalyzer, because the second dump cycle of optimised 
> bootstrap unsets it.
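> A rough sketch of the branch being removed (reconstructed from this 
> description for illustration, not copied from the source):
> {code:java}
> // ReplicationSemanticAnalyzer#initReplDump (old behaviour, illustrative):
> Map<String, String> params = db.getParameters();
> if (params.containsKey("repl.target.for")
>     && "TARGET".equalsIgnoreCase(params.get("repl.failover.endpoint"))) {
>   // This unset is what the fix removes: OptimisedBootstrapUtils later
>   // checks repl.target.for on the target to trigger optimised bootstrap,
>   // and the second dump cycle unsets the property itself.
>   params.remove("repl.target.for");
> }
> {code}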



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26597) Fix unsetting of db prop repl.target.for in ReplicationSemanticAnalyzer

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26597?focusedWorklogId=839824&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839824
 ]

ASF GitHub Bot logged work on HIVE-26597:
-

Author: ASF GitHub Bot
Created on: 18/Jan/23 04:35
Start Date: 18/Jan/23 04:35
Worklog Time Spent: 10m 
  Work Description: pudidic merged PR #3788:
URL: https://github.com/apache/hive/pull/3788




Issue Time Tracking
---

Worklog Id: (was: 839824)
Time Spent: 50m  (was: 40m)

> Fix unsetting of db prop repl.target.for in ReplicationSemanticAnalyzer
> ---
>
> Key: HIVE-26597
> URL: https://issues.apache.org/jira/browse/HIVE-26597
> Project: Hive
>  Issue Type: Bug
>Reporter: Teddy Choi
>Assignee: Rakshith C
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> When the repl policy is set from A -> B:
>  * *repl.target.for* is set on B.
> When failover is initiated:
>  * *repl.failover.endpoint* = *'TARGET'* is set on B.
>  
> Now, when the reverse policy is set up from *A <- B*, there is a check in 
> [ReplicationSemanticAnalyzer#initReplDump|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/parse/ReplicationSemanticAnalyzer.java#L196]
>  which checks for the existence of these two properties and, if they are 
> both set, unsets the *repl.target.for* property.
> Because of this, optimised bootstrap is not triggered, since it checks for 
> the existence of the *repl.target.for* property during repl dump on the 
> target 
> [HERE|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/repl/OptimisedBootstrapUtils.java#L93].
>  
> Fix: remove the code that unsets *repl.target.for* in 
> ReplicationSemanticAnalyzer, because the second dump cycle of optimised 
> bootstrap unsets it.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26597) Fix unsetting of db prop repl.target.for in ReplicationSemanticAnalyzer

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26597?focusedWorklogId=839823&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839823
 ]

ASF GitHub Bot logged work on HIVE-26597:
-

Author: ASF GitHub Bot
Created on: 18/Jan/23 04:34
Start Date: 18/Jan/23 04:34
Worklog Time Spent: 10m 
  Work Description: pudidic commented on PR #3788:
URL: https://github.com/apache/hive/pull/3788#issuecomment-1386474212

   LGTM +1. I'll merge it as it's a trivial change. Thank you. :)




Issue Time Tracking
---

Worklog Id: (was: 839823)
Time Spent: 40m  (was: 0.5h)

> Fix unsetting of db prop repl.target.for in ReplicationSemanticAnalyzer
> ---
>
> Key: HIVE-26597
> URL: https://issues.apache.org/jira/browse/HIVE-26597
> Project: Hive
>  Issue Type: Bug
>Reporter: Teddy Choi
>Assignee: Rakshith C
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> When the repl policy is set from A -> B:
>  * *repl.target.for* is set on B.
> When failover is initiated:
>  * *repl.failover.endpoint* = *'TARGET'* is set on B.
>  
> Now, when the reverse policy is set up from *A <- B*, there is a check in 
> [ReplicationSemanticAnalyzer#initReplDump|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/parse/ReplicationSemanticAnalyzer.java#L196]
>  which checks for the existence of these two properties and, if they are 
> both set, unsets the *repl.target.for* property.
> Because of this, optimised bootstrap is not triggered, since it checks for 
> the existence of the *repl.target.for* property during repl dump on the 
> target 
> [HERE|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/repl/OptimisedBootstrapUtils.java#L93].
>  
> Fix: remove the code that unsets *repl.target.for* in 
> ReplicationSemanticAnalyzer, because the second dump cycle of optimised 
> bootstrap unsets it.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26956) Improve find_in_set function

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26956?focusedWorklogId=839814&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839814
 ]

ASF GitHub Bot logged work on HIVE-26956:
-

Author: ASF GitHub Bot
Created on: 18/Jan/23 03:23
Start Date: 18/Jan/23 03:23
Worklog Time Spent: 10m 
  Work Description: TaoZex opened a new pull request, #3961:
URL: https://github.com/apache/hive/pull/3961

   
   
   ### What changes were proposed in this pull request?
   
   
   Improve find_in_set function
   
   ### Why are the changes needed?
   
   
   Code redundancy
   
   ### Does this PR introduce _any_ user-facing change?
   
   no
   
   ### How was this patch tested?
   
   




Issue Time Tracking
---

Worklog Id: (was: 839814)
Time Spent: 40m  (was: 0.5h)

> Improve find_in_set function
> ---
>
> Key: HIVE-26956
> URL: https://issues.apache.org/jira/browse/HIVE-26956
> Project: Hive
>  Issue Type: Improvement
>Reporter: Bingye Chen
>Assignee: Bingye Chen
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Improve find_in_set function



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26956) Improve find_in_set function

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26956?focusedWorklogId=839813&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839813
 ]

ASF GitHub Bot logged work on HIVE-26956:
-

Author: ASF GitHub Bot
Created on: 18/Jan/23 03:23
Start Date: 18/Jan/23 03:23
Worklog Time Spent: 10m 
  Work Description: TaoZex closed pull request #3961: HIVE-26956: Improve 
find_in_set function
URL: https://github.com/apache/hive/pull/3961




Issue Time Tracking
---

Worklog Id: (was: 839813)
Time Spent: 0.5h  (was: 20m)

> Improve find_in_set function
> ---
>
> Key: HIVE-26956
> URL: https://issues.apache.org/jira/browse/HIVE-26956
> Project: Hive
>  Issue Type: Improvement
>Reporter: Bingye Chen
>Assignee: Bingye Chen
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Improve find_in_set function



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26739) When kerberos is enabled, hiveserver2 error connecting metastore: No valid credentials provided

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26739?focusedWorklogId=839812&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839812
 ]

ASF GitHub Bot logged work on HIVE-26739:
-

Author: ASF GitHub Bot
Created on: 18/Jan/23 03:17
Start Date: 18/Jan/23 03:17
Worklog Time Spent: 10m 
  Work Description: xiuzhu9527 commented on PR #3764:
URL: https://github.com/apache/hive/pull/3764#issuecomment-1386425591

   thx!




Issue Time Tracking
---

Worklog Id: (was: 839812)
Time Spent: 50m  (was: 40m)

> When kerberos is enabled, hiveserver2 error connecting metastore: No valid 
> credentials provided
> ---
>
> Key: HIVE-26739
> URL: https://issues.apache.org/jira/browse/HIVE-26739
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0, 3.0.0
>Reporter: weiliang hao
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> If the environment variable HADOOP_USER_NAME exists, HiveServer2 fails to 
> connect to the metastore with "No valid credentials provided".
> The problem is in the getUGI method of the 
> org.apache.hadoop.hive.shims.Utils class. It should first check 
> {{UserGroupInformation.isSecurityEnabled()}}: if true, return 
> {{UserGroupInformation.getCurrentUser()}}; if false, obtain the user name 
> from the HADOOP_USER_NAME environment variable to create a UGI.
>  
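> A minimal sketch of the proposed getUGI behaviour (illustrative, not the 
> committed patch):
> {code:java}
> public static UserGroupInformation getUGI() throws LoginException, IOException {
>   if (UserGroupInformation.isSecurityEnabled()) {
>     // Kerberos is enabled: use the logged-in user and ignore
>     // HADOOP_USER_NAME, so SASL/GSS negotiation has real credentials.
>     return UserGroupInformation.getCurrentUser();
>   }
>   // No Kerberos: honour HADOOP_USER_NAME when present.
>   String user = System.getenv("HADOOP_USER_NAME");
>   if (user != null && !user.isEmpty()) {
>     return UserGroupInformation.createRemoteUser(user);
>   }
>   return UserGroupInformation.getCurrentUser();
> }
> {code}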
> {code:java}
> 2022-11-15T15:41:06,971 ERROR [HiveServer2-Background-Pool: Thread-36] 
> transport.TSaslTransport: SASL negotiation failure
> javax.security.sasl.SaslException: GSS initiate failed
>         at 
> com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
>  ~[?:1.8.0_144]
>         at 
> org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94)
>  ~[hive-exec-3.1.3.jar:3.1.3]
>         at 
> org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271) 
> ~[hive-exec-3.1.3.jar:3.1.3]
>         at 
> org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
>  ~[hive-exec-3.1.3.jar:3.1.3]
>         at 
> org.apache.hadoop.hive.metastore.security.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:51)
>  ~[hive-exec-3.1.3.jar:3.1.3]
>         at 
> org.apache.hadoop.hive.metastore.security.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:48)
>  ~[hive-exec-3.1.3.jar:3.1.3]
>         at java.security.AccessController.doPrivileged(Native Method) 
> ~[?:1.8.0_144]
>         at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_144]
>         at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
>  ~[hadoop-common-3.2.1.jar:?]
>         at 
> org.apache.hadoop.hive.metastore.security.TUGIAssumingTransport.open(TUGIAssumingTransport.java:48)
>  ~[hive-exec-3.1.3.jar:3.1.3]
>         at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:516)
>  ~[hive-exec-3.1.3.jar:3.1.3]
>         at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:224)
>  ~[hive-exec-3.1.3.jar:3.1.3]
>         at 
> org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:94)
>  ~[hive-exec-3.1.3.jar:3.1.3]
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method) ~[?:1.8.0_144]
>         at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>  ~[?:1.8.0_144]
>         at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>  ~[?:1.8.0_144]
>         at java.lang.reflect.Constructor.newInstance(Constructor.java:423) 
> ~[?:1.8.0_144]
>         at 
> org.apache.hadoop.hive.metastore.utils.JavaUtils.newInstance(JavaUtils.java:84)
>  ~[hive-exec-3.1.3.jar:3.1.3]
>         at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:95)
>  ~[hive-exec-3.1.3.jar:3.1.3]
>         at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:148)
>  ~[hive-exec-3.1.3.jar:3.1.3]
>         at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:119)
>  ~[hive-exec-3.1.3.jar:3.1.3]
>         at 
> org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:4306) 
> ~[hive-exec-3.1.3.jar:3.1.3]
>         at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:4374) 
> ~[hive-exec-3.1.3.jar:3.1.3]
>         at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:4354) 
> ~[hive-exec-3.1.3.jar:3.1.3]
>         at 
> 

[jira] [Work logged] (HIVE-26904) QueryCompactor failed in commitCompaction if the tmp table dir is already removed

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26904?focusedWorklogId=839811&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839811
 ]

ASF GitHub Bot logged work on HIVE-26904:
-

Author: ASF GitHub Bot
Created on: 18/Jan/23 03:10
Start Date: 18/Jan/23 03:10
Worklog Time Spent: 10m 
  Work Description: stiga-huang commented on code in PR #3910:
URL: https://github.com/apache/hive/pull/3910#discussion_r1073054442


##
ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/QueryCompactor.java:
##
@@ -245,16 +243,32 @@ private static void disableLlapCaching(HiveConf conf) {
  * @throws IOException the directory cannot be deleted
  * @throws HiveException the table is not found
  */
-static void cleanupEmptyDir(HiveConf conf, String tmpTableName) throws 
IOException, HiveException {
+static void cleanupEmptyTableDir(HiveConf conf, String tmpTableName)
+throws IOException, HiveException {
   org.apache.hadoop.hive.ql.metadata.Table tmpTable = 
Hive.get().getTable(tmpTableName);
   if (tmpTable != null) {
-Path path = new Path(tmpTable.getSd().getLocation());
-FileSystem fs = path.getFileSystem(conf);
+cleanupEmptyDir(conf, new Path(tmpTable.getSd().getLocation()));
+  }
+}
+
+/**
+ * Remove the directory if it's empty.
+ * @param conf the Hive configuration
+ * @param path path of the directory
+ * @throws IOException if any IO error occurs
+ */
+static void cleanupEmptyDir(HiveConf conf, Path path) throws IOException {
+  FileSystem fs = path.getFileSystem(conf);
+  try {
 if (!fs.listFiles(path, false).hasNext()) {
   fs.delete(path, true);
 }
+  } catch (FileNotFoundException e) {
+// Ignore the case when the dir was already removed
+LOG.warn("Ignored exception during cleanup {}", path, e);

Review Comment:
   FWIW, the following log shows the stacktrace of where the 
`FileNotFoundException` is thrown:
   ```
   2023-01-02T02:12:55,849 ERROR 
[impala-ec2-centos79-m6i-4xlarge-ondemand-1428.vpc.cloudera.com-48_executor] 
compactor.Worker: Caught exception while trying to compact 
id:15,dbname:partial_catalog_info_test,tableName:insert_only_partitioned,partName:part=1,state:^@,type:MINOR,enqueueTime:0,start:0,properties:null,runAs:jenkins,tooManyAborts:false,hasOldAbort:false,highestWriteId:3,errorMessage:null,workerId:
 null,initiatorId: null,retryRetention0. Marking failed to avoid repeated 
failures
   java.io.FileNotFoundException: File 
hdfs://localhost:20500/tmp/hive/jenkins/092b533a-81c8-4b95-88e4-9472cf6f365d/_tmp_space.db/62ec04fb-e2d2-4a99-a454-ae709a3cccfe
 does not exist.
   at 
org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1275)
 ~[hadoop-hdfs-client-3.1.1.7.2.15.4-6.jar:?]
   at 
org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1249)
 ~[hadoop-hdfs-client-3.1.1.7.2.15.4-6.jar:?]
   at 
org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1194)
 ~[hadoop-hdfs-client-3.1.1.7.2.15.4-6.jar:?]
   at 
org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1190)
 ~[hadoop-hdfs-client-3.1.1.7.2.15.4-6.jar:?]
   at 
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
 ~[hadoop-common-3.1.1.7.2.15.4-6.jar:?]
   at 
org.apache.hadoop.hdfs.DistributedFileSystem.listLocatedStatus(DistributedFileSystem.java:1208)
 ~[hadoop-hdfs-client-3.1.1.7.2.15.4-6.jar:?]
   at 
org.apache.hadoop.fs.FileSystem.listLocatedStatus(FileSystem.java:2144) 
~[hadoop-common-3.1.1.7.2.15.4-6.jar:?]
   at org.apache.hadoop.fs.FileSystem$5.<init>(FileSystem.java:2302) 
~[hadoop-common-3.1.1.7.2.15.4-6.jar:?]
   at org.apache.hadoop.fs.FileSystem.listFiles(FileSystem.java:2299) 
~[hadoop-common-3.1.1.7.2.15.4-6.jar:?]
   at 
org.apache.hadoop.hive.ql.txn.compactor.QueryCompactor$Util.cleanupEmptyDir(QueryCompactor.java:261)
 ~[hive-exec-3.1.3000.2022.0.13.0-60.jar:3.1.3000.2022.0.13.0-60]
   at 
org.apache.hadoop.hive.ql.txn.compactor.MmMinorQueryCompactor.commitCompaction(MmMinorQueryCompactor.java:72)
 ~[hive-exec-3.1.3000.2022.0.13.0-60.jar:3.1.3000.2022.0.13.0-60]
   at 
org.apache.hadoop.hive.ql.txn.compactor.QueryCompactor.runCompactionQueries(QueryCompactor.java:146)
 ~[hive-exec-3.1.3000.2022.0.13.0-60.jar:3.1.3000.2022.0.13.0-60]
   at 
org.apache.hadoop.hive.ql.txn.compactor.MmMinorQueryCompactor.runCompaction(MmMinorQueryCompactor.java:63)
 ~[hive-exec-3.1.3000.2022.0.13.0-60.jar:3.1.3000.2022.0.13.0-60]
   at 
org.apache.hadoop.hive.ql.txn.compactor.Worker.findNextCompactionAndExecute(Worker.java:435)
 ~[hive-exec-3.1.3000.2022.0.13.0-60.jar:3.1.3000.2022.0.13.0-60]
 

[jira] [Work logged] (HIVE-26915) Backport of HIVE-23692 TestCodahaleMetrics.testFileReporting is flaky

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26915?focusedWorklogId=839810&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839810
 ]

ASF GitHub Bot logged work on HIVE-26915:
-

Author: ASF GitHub Bot
Created on: 18/Jan/23 02:46
Start Date: 18/Jan/23 02:46
Worklog Time Spent: 10m 
  Work Description: amanraj2520 commented on PR #3928:
URL: https://github.com/apache/hive/pull/3928#issuecomment-1386397424

   @zabetak @abstractdog Can you please review and merge this




Issue Time Tracking
---

Worklog Id: (was: 839810)
Time Spent: 1h 10m  (was: 1h)

> Backport of HIVE-23692 TestCodahaleMetrics.testFileReporting is flaky
> -
>
> Key: HIVE-26915
> URL: https://issues.apache.org/jira/browse/HIVE-26915
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Aman Raj
>Assignee: Aman Raj
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> This was committed in master without a HIVE Jira task. This is the commit id 
> : 130f80445d589cdd82904cea1073c84d1368d079



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26915) Backport of HIVE-23692 TestCodahaleMetrics.testFileReporting is flaky

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26915?focusedWorklogId=839809&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839809
 ]

ASF GitHub Bot logged work on HIVE-26915:
-

Author: ASF GitHub Bot
Created on: 18/Jan/23 02:45
Start Date: 18/Jan/23 02:45
Worklog Time Spent: 10m 
  Work Description: amanraj2520 commented on PR #3928:
URL: https://github.com/apache/hive/pull/3928#issuecomment-1386395934

   @zabetak Here is another flakiness 
http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-3954/4/tests




Issue Time Tracking
---

Worklog Id: (was: 839809)
Time Spent: 1h  (was: 50m)

> Backport of HIVE-23692 TestCodahaleMetrics.testFileReporting is flaky
> -
>
> Key: HIVE-26915
> URL: https://issues.apache.org/jira/browse/HIVE-26915
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Aman Raj
>Assignee: Aman Raj
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> This was committed in master without a HIVE Jira task. This is the commit id 
> : 130f80445d589cdd82904cea1073c84d1368d079



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26945) Test fixes for query*.q files

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26945?focusedWorklogId=839808&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839808
 ]

ASF GitHub Bot logged work on HIVE-26945:
-

Author: ASF GitHub Bot
Created on: 18/Jan/23 02:42
Start Date: 18/Jan/23 02:42
Worklog Time Spent: 10m 
  Work Description: amanraj2520 commented on PR #3954:
URL: https://github.com/apache/hive/pull/3954#issuecomment-1386394089

   Hi @zabetak @abstractdog Can you please approve this. There is one flaky 
test that is failing. I have fixed that in 
https://github.com/apache/hive/pull/3928




Issue Time Tracking
---

Worklog Id: (was: 839808)
Time Spent: 20m  (was: 10m)

> Test fixes for query*.q files
> -
>
> Key: HIVE-26945
> URL: https://issues.apache.org/jira/browse/HIVE-26945
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Aman Raj
>Assignee: Aman Raj
>Priority: Critical
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The tests has outdated q.out files which need to be updated.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HIVE-26893) Extend batch partition APIs to ignore partition schemas

2023-01-17 Thread Sai Hemanth Gantasala (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sai Hemanth Gantasala reassigned HIVE-26893:


Assignee: Sai Hemanth Gantasala

> Extend batch partition APIs to ignore partition schemas
> ---
>
> Key: HIVE-26893
> URL: https://issues.apache.org/jira/browse/HIVE-26893
> Project: Hive
>  Issue Type: New Feature
>  Components: Metastore
>Reporter: Quanlong Huang
>Assignee: Sai Hemanth Gantasala
>Priority: Major
>
> There are several HMS APIs that return a list of partitions, e.g. 
> get_partitions_ps(), get_partitions_by_names(), add_partitions_req() with 
> needResult=true, etc. Each partition instance will have a unique list of 
> FieldSchemas as the partition schema:
> {code:java}
> org.apache.hadoop.hive.metastore.api.Partition
> -> org.apache.hadoop.hive.metastore.api.StorageDescriptor
>    -> cols: list<FieldSchema> {code}
> This can occupy a large memory footprint for wide tables (e.g. with 2k 
> cols). See the heap histogram in IMPALA-11812 as an example.
> Some engines like Impala don't actually use/respect the partition-level 
> schema, so it is a waste of network/serde resources to transmit it. It would 
> be nice if these APIs provided an optional boolean flag for ignoring 
> partition schemas, so HMS clients (e.g. Impala) don't need to clear them 
> later (to save memory).
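> A hypothetical illustration of the proposed flag on one of the batch APIs; 
> the setter below is invented for this sketch and is not part of the current 
> Thrift definition:
> {code:java}
> GetPartitionsByNamesRequest req = new GetPartitionsByNamesRequest(dbName, tblName);
> req.setNames(partNames);
> // Proposed (hypothetical) flag: the server omits sd.cols for every
> // returned partition, shrinking the response for wide tables.
> req.setSkipColumnSchemaForPartition(true);
> GetPartitionsByNamesResult result = client.getPartitionsByNames(req);
> {code}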



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26648) Upgrade Bouncy Castle to 1.70 due to high CVEs

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26648?focusedWorklogId=839786&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839786
 ]

ASF GitHub Bot logged work on HIVE-26648:
-

Author: ASF GitHub Bot
Created on: 18/Jan/23 00:21
Start Date: 18/Jan/23 00:21
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #3744: 
HIVE-26648:removing direct depedency of bouncycastle
URL: https://github.com/apache/hive/pull/3744




Issue Time Tracking
---

Worklog Id: (was: 839786)
Time Spent: 2.5h  (was: 2h 20m)

>  Upgrade Bouncy Castle to 1.70 due to high CVEs
> ---
>
> Key: HIVE-26648
> URL: https://issues.apache.org/jira/browse/HIVE-26648
> Project: Hive
>  Issue Type: Task
>Reporter: Devaspati Krishnatri
>Assignee: Devaspati Krishnatri
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26598) Fix unsetting of db params for optimized bootstrap when repl dump initiates data copy

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26598?focusedWorklogId=839784&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839784
 ]

ASF GitHub Bot logged work on HIVE-26598:
-

Author: ASF GitHub Bot
Created on: 18/Jan/23 00:21
Start Date: 18/Jan/23 00:21
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on PR #3780:
URL: https://github.com/apache/hive/pull/3780#issuecomment-1386276643

   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the dev@hive.apache.org list if the patch is in 
need of reviews.




Issue Time Tracking
---

Worklog Id: (was: 839784)
Time Spent: 0.5h  (was: 20m)

> Fix unsetting of db params for optimized bootstrap when repl dump initiates 
> data copy
> -
>
> Key: HIVE-26598
> URL: https://issues.apache.org/jira/browse/HIVE-26598
> Project: Hive
>  Issue Type: Bug
>Reporter: Teddy Choi
>Assignee: Rakshith C
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When hive.repl.run.data.copy.tasks.on.target is set to false, the repl dump 
> task initiates the copy task from the source cluster to the staging 
> directory.
> In the current code flow, the repl dump task dumps the metadata and then 
> creates another repl dump task with datacopyIterators initialized.
> When the second dump cycle executes, it directly begins the data copy tasks. 
> Because of this, we never enter the second reverse dump flow, and 
> unsetDbPropertiesForOptimisedBootstrap is never set to true again.
> This results in the db params (repl.target.for, repl.background.threads, 
> etc.) not being unset.
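> A sketch of the gap (method and field names illustrative):
> {code:java}
> // Second cycle of the dump task: datacopyIterators was pre-populated by
> // the first cycle, so execution jumps straight into data copy...
> if (!work.getDataCopyIterators().isEmpty()) {
>   return executeDataCopyTasks(work);
> }
> // ...and the reverse-dump branch below, the only place that flips
> // unsetDbPropertiesForOptimisedBootstrap and clears repl.target.for,
> // repl.background.threads, etc., is never reached.
> {code}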



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26648) Upgrade Bouncy Castle to 1.70 due to high CVEs

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26648?focusedWorklogId=839787&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839787
 ]

ASF GitHub Bot logged work on HIVE-26648:
-

Author: ASF GitHub Bot
Created on: 18/Jan/23 00:21
Start Date: 18/Jan/23 00:21
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #3727: 
HIVE-26648:Upgrade Bouncy Castle to 1.70 due to high CVEs
URL: https://github.com/apache/hive/pull/3727




Issue Time Tracking
---

Worklog Id: (was: 839787)
Time Spent: 2h 40m  (was: 2.5h)

>  Upgrade Bouncy Castle to 1.70 due to high CVEs
> ---
>
> Key: HIVE-26648
> URL: https://issues.apache.org/jira/browse/HIVE-26648
> Project: Hive
>  Issue Type: Task
>Reporter: Devaspati Krishnatri
>Assignee: Devaspati Krishnatri
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26757) Add sfs+ofs support

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26757?focusedWorklogId=839785&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839785
 ]

ASF GitHub Bot logged work on HIVE-26757:
-

Author: ASF GitHub Bot
Created on: 18/Jan/23 00:21
Start Date: 18/Jan/23 00:21
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on PR #3779:
URL: https://github.com/apache/hive/pull/3779#issuecomment-1386276682

   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the dev@hive.apache.org list if the patch is in 
need of reviews.




Issue Time Tracking
---

Worklog Id: (was: 839785)
Time Spent: 1h  (was: 50m)

> Add sfs+ofs support
> ---
>
> Key: HIVE-26757
> URL: https://issues.apache.org/jira/browse/HIVE-26757
> Project: Hive
>  Issue Type: Improvement
>Reporter: Michael Smith
>Assignee: Michael Smith
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> [https://github.com/apache/hive/blob/ebb1e2fa9914bcccecad261d53338933b699ccb1/ql/src/java/org/apache/hadoop/hive/ql/io/SingleFileSystem.java#L80]
>  shows SFS support for Ozone's o3fs protocol, but not the newer ofs protocol. 
> Please add support for {{sfs+ofs}}.
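> A minimal sketch of what the addition could look like, assuming the same 
> inner-subclass pattern as the existing o3fs entry at the linked line (class 
> shape assumed, not copied from the source):
> {code:java}
> // Hypothetical sibling of the o3fs support in SingleFileSystem,
> // registering the "sfs+ofs" scheme for Ozone's newer ofs protocol.
> public static class OFS extends SingleFileSystem {
>   @Override
>   public String getScheme() {
>     return "sfs+ofs";
>   }
> }
> {code}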



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26925) MV with iceberg storage format fails when contains 'PARTITIONED ON' clause due to column number/types difference.

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26925?focusedWorklogId=839778&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839778
 ]

ASF GitHub Bot logged work on HIVE-26925:
-

Author: ASF GitHub Bot
Created on: 17/Jan/23 23:13
Start Date: 17/Jan/23 23:13
Worklog Time Spent: 10m 
  Work Description: sonarcloud[bot] commented on PR #3939:
URL: https://github.com/apache/hive/pull/3939#issuecomment-1386210004

   Kudos, SonarCloud Quality Gate passed! 
   [Quality Gate passed](https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=3939)
   0 Bugs, 0 Vulnerabilities, 0 Security Hotspots, 0 Code Smells.
   No Coverage information. No Duplication information.




Issue Time Tracking
---

Worklog Id: (was: 839778)
Time Spent: 1h 50m  (was: 1h 40m)

> MV with iceberg storage format fails when contains 'PARTITIONED ON' clause 
> due to column number/types difference.
> -
>
> Key: HIVE-26925
> URL: https://issues.apache.org/jira/browse/HIVE-26925
> Project: Hive
>  Issue Type: Bug
>  Components: Iceberg integration
>Reporter: Dharmik Thakkar
>Assignee: Krisztian Kasa
>Priority: Critical
>  Labels: pull-request-available
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> MV with iceberg storage format fails when contains 'PARTITIONED ON' clause 
> due to column number/types difference.
> {code:java}
> !!! annotations iceberg
> >>> use iceberg_test_db_hive;
> No rows affected
> >>> set hive.exec.max.dynamic.partitions=2000;
> >>> set hive.exec.max.dynamic.partitions.pernode=2000;
> >>> drop materialized view if exists mv_agg_gby_col_partitioned;
> >>> create materialized view mv_agg_gby_col_partitioned PARTITIONED ON (t) 
> >>> stored by iceberg stored as orc tblproperties ('format-version'='1') as 
> >>> select b,f,sum(b), sum(f),t from all100k group by b,f,v,c,t;
> >>> analyze table mv_agg_gby_col_partitioned compute statistics for columns;
> >>> set hive.explain.user=false;
> >>> explain select b,f,sum(b) from all100k where t=93 group by c,v,f,b;
> !!! match row_contains
>   

[jira] [Work logged] (HIVE-26928) LlapIoImpl::getParquetFooterBuffersFromCache throws exception when metadata cache is disabled

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26928?focusedWorklogId=839776&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839776
 ]

ASF GitHub Bot logged work on HIVE-26928:
-

Author: ASF GitHub Bot
Created on: 17/Jan/23 22:48
Start Date: 17/Jan/23 22:48
Worklog Time Spent: 10m 
  Work Description: sonarcloud[bot] commented on PR #3962:
URL: https://github.com/apache/hive/pull/3962#issuecomment-1386187492

   Kudos, SonarCloud Quality Gate passed! 
   [Quality Gate passed](https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=3962)
   0 Bugs, 0 Vulnerabilities, 0 Security Hotspots, 0 Code Smells.
   No Coverage information. No Duplication information.




Issue Time Tracking
---

Worklog Id: (was: 839776)
Time Spent: 20m  (was: 10m)

> LlapIoImpl::getParquetFooterBuffersFromCache throws exception when metadata 
> cache is disabled
> -
>
> Key: HIVE-26928
> URL: https://issues.apache.org/jira/browse/HIVE-26928
> Project: Hive
>  Issue Type: Improvement
>  Components: Iceberg integration
>Reporter: Rajesh Balamohan
>Assignee: Simhadri Govindappa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> When the metadata / LLAP cache is disabled 
> ({{hive.llap.io.memory.mode=none}}), "iceberg + parquet" reads throw the 
> following error.
> The code should check for the metadata cache correctly, or this should be 
> fixed in LlapIoImpl.
>  
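> A sketch of the guard this could use instead of the 
> Preconditions.checkNotNull seen in the trace below (illustrative):
> {code:java}
> // LlapIoImpl#getParquetFooterBuffersFromCache: with
> // hive.llap.io.memory.mode=none the metadata cache is never initialised,
> // so return null and let the caller read the footer from the file.
> if (metadataCache == null) {
>   return null;
> }
> {code}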
> {noformat}
> Caused by: java.lang.NullPointerException: Metadata cache must not be null
>     at 
> com.google.common.base.Preconditions.checkNotNull(Preconditions.java:897)
>     at 
> org.apache.hadoop.hive.llap.io.api.impl.LlapIoImpl.getParquetFooterBuffersFromCache(LlapIoImpl.java:467)
>     at 
> org.apache.iceberg.mr.hive.vector.HiveVectorizedReader.parquetRecordReader(HiveVectorizedReader.java:227)
>     at 
> org.apache.iceberg.mr.hive.vector.HiveVectorizedReader.reader(HiveVectorizedReader.java:162)
>     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method)
>     at 
> 

[jira] [Work logged] (HIVE-26924) Alter materialized view enable rewrite throws SemanticException for source iceberg table

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26924?focusedWorklogId=839767&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839767
 ]

ASF GitHub Bot logged work on HIVE-26924:
-

Author: ASF GitHub Bot
Created on: 17/Jan/23 21:39
Start Date: 17/Jan/23 21:39
Worklog Time Spent: 10m 
  Work Description: scarlin-cloudera commented on code in PR #3936:
URL: https://github.com/apache/hive/pull/3936#discussion_r1072857383


##
ql/src/java/org/apache/hadoop/hive/ql/ddl/view/materialized/alter/rewrite/AlterMaterializedViewRewriteAnalyzer.java:
##
@@ -68,10 +68,12 @@ public void analyzeInternal(ASTNode root) throws 
SemanticException {
 Table materializedViewTable = getTable(tableName, true);
 
 // One last test: if we are enabling the rewrite, we need to check that 
query
-// only uses transactional (MM and ACID) tables
+// only uses transactional (MM and ACID and Iceberg) tables
 if (rewriteEnable) {
   for (SourceTable sourceTable : 
materializedViewTable.getMVMetadata().getSourceTables()) {
-if (!AcidUtils.isTransactionalTable(sourceTable.getTable())) {
+Table table = new Table(sourceTable.getTable());
+if (!AcidUtils.isTransactionalTable(sourceTable.getTable()) &&
+!(table.isNonNative() && 
table.getStorageHandler().areSnapshotsSupported())) {

Review Comment:
   Out of curiosity (and I don't know this code at all), what is the reason for 
the "isNonNative()" check?  I guess there's a native table where 
"areSnapshotsSupported()" returns true?  From the name, it sounds like this 
alone should have been enough.





Issue Time Tracking
---

Worklog Id: (was: 839767)
Time Spent: 40m  (was: 0.5h)

> Alter materialized view enable rewrite throws SemanticException for source 
> iceberg table
> 
>
> Key: HIVE-26924
> URL: https://issues.apache.org/jira/browse/HIVE-26924
> Project: Hive
>  Issue Type: Bug
>  Components: Iceberg integration
>Reporter: Dharmik Thakkar
>Assignee: Krisztian Kasa
>Priority: Critical
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> alter materialized view enable rewrite throws SemanticException for source 
> iceberg table
> SQL test
> {code:java}
> >>> create materialized view mv_rewrite as select t, si from all100k where 
> >>> t>115;
> >>> analyze table mv_rewrite compute statistics for columns;
> >>> set hive.explain.user=false;
> >>> explain select si,t from all100k where t>116 and t<120;
> !!! match row_contains
>   alias: iceberg_test_db_hive.mv_rewrite
> >>> alter materialized view mv_rewrite disable rewrite;
> >>> explain select si,t from all100k where t>116 and t<120;
> !!! match row_contains
>   alias: all100k
> >>> alter materialized view mv_rewrite enable rewrite;
> >>> explain select si,t from all100k where t>116 and t<120;
> !!! match row_contains
>   alias: iceberg_test_db_hive.mv_rewrite
> >>> drop materialized view mv_rewrite; {code}
>  
> Error
> {code:java}
> 2023-01-10T18:40:34,303 INFO  [pool-3-thread-1] jdbc.TestDriver: Query: alter 
> materialized view mv_rewrite enable rewrite
> 2023-01-10T18:40:34,365 INFO  [Thread-10] jdbc.TestDriver: INFO  : Compiling 
> command(queryId=hive_20230110184034_f557b4a6-40a0-42ba-8e67-2f273f50af36): 
> alter materialized view mv_rewrite enable rewrite
> 2023-01-10T18:40:34,426 INFO  [Thread-10] jdbc.TestDriver: ERROR : FAILED: 
> SemanticException Automatic rewriting for materialized view cannot be enabled 
> if the materialized view uses non-transactional tables
> 2023-01-10T18:40:34,426 INFO  [Thread-10] jdbc.TestDriver: 
> org.apache.hadoop.hive.ql.parse.SemanticException: Automatic rewriting for 
> materialized view cannot be enabled if the materialized view uses 
> non-transactional tables
> 2023-01-10T18:40:34,426 INFO  [Thread-10] jdbc.TestDriver:      at 
> org.apache.hadoop.hive.ql.ddl.view.materialized.alter.rewrite.AlterMaterializedViewRewriteAnalyzer.analyzeInternal(AlterMaterializedViewRewriteAnalyzer.java:75)
> 2023-01-10T18:40:34,426 INFO  [Thread-10] jdbc.TestDriver:      at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:313)
> 2023-01-10T18:40:34,427 INFO  [Thread-10] jdbc.TestDriver:      at 
> org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:222)
> 2023-01-10T18:40:34,427 INFO  [Thread-10] jdbc.TestDriver:      at 
> org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:105)
> 2023-01-10T18:40:34,427 INFO  [Thread-10] jdbc.TestDriver:      at 
> org.apache.hadoop.hive.ql.Driver.compile(Driver.java:201)
> 2023-01-10T18:40:34,427 INFO  [Thread-10] jdbc.TestDriver:      at 
> 

[jira] [Work logged] (HIVE-22977) Merge delta files instead of running a query in major/minor compaction

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22977?focusedWorklogId=839766&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839766
 ]

ASF GitHub Bot logged work on HIVE-22977:
-

Author: ASF GitHub Bot
Created on: 17/Jan/23 21:36
Start Date: 17/Jan/23 21:36
Worklog Time Spent: 10m 
  Work Description: sonarcloud[bot] commented on PR #3801:
URL: https://github.com/apache/hive/pull/3801#issuecomment-1386083177

   Kudos, SonarCloud Quality Gate passed! 
   [Quality Gate passed](https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=3801)
   0 Bugs, 0 Vulnerabilities, 0 Security Hotspots, 2 Code Smells.
   No Coverage information. No Duplication information.




Issue Time Tracking
---

Worklog Id: (was: 839766)
Time Spent: 4h 40m  (was: 4.5h)

> Merge delta files instead of running a query in major/minor compaction
> --
>
> Key: HIVE-22977
> URL: https://issues.apache.org/jira/browse/HIVE-22977
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Pintér
>Assignee: Sourabh Badhya
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22977.01.patch, HIVE-22977.02.patch
>
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> [Compaction Optimization]
> We should analyse the possibility of moving delta files instead of running a 
> major/minor compaction query.
> Please consider the following use cases:
>  - full acid table, but only insert queries were run. This means that no 
> delete delta directories were created. Is it possible to merge the delta 
> directory contents without running a compaction query?
>  - full acid table, writing through the streaming API. If there are no 
> aborted transactions during streaming, is it possible to merge the delta 
> directory contents without running a compaction query? (A sketch of the 
> required precondition follows below.)
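> A sketch of the precondition such a merge would need (names follow 
> AcidUtils but are used illustratively):
> {code:java}
> // Deltas can only be merged/moved without a compaction query when there
> // is nothing a query-based compaction would need to filter out:
> boolean canMergeWithoutQuery =
>     dir.getCurrentDirectories().stream()
>         .noneMatch(AcidUtils.ParsedDelta::isDeleteDelta)  // insert-only
>     && dir.getAbortedDirectories().isEmpty();             // no aborted txns
> {code}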



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26922) Deadlock when rebuilding Materialized view stored by Iceberg

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26922?focusedWorklogId=839765&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839765
 ]

ASF GitHub Bot logged work on HIVE-26922:
-

Author: ASF GitHub Bot
Created on: 17/Jan/23 21:31
Start Date: 17/Jan/23 21:31
Worklog Time Spent: 10m 
  Work Description: scarlin-cloudera commented on code in PR #3934:
URL: https://github.com/apache/hive/pull/3934#discussion_r1072847233


##
ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java:
##
@@ -3122,7 +3117,19 @@ Seems much cleaner if each stmt is identified as a 
particular HiveOperation (whi
 }
 return lockComponents;
   }
-  
+
+  private static LockType getLockTypeFromStorageHandler(WriteEntity output, 
Table t) {
+final HiveStorageHandler storageHandler = 
Preconditions.checkNotNull(t.getStorageHandler(),
+"Non-native tables must have an instance of storage handler.");
+LockType lockType = storageHandler.getLockType(output);
+if (null == LockType.findByValue(lockType.getValue())) {
+  throw new IllegalArgumentException(String
+  .format("Lock type [%s] for Database.Table [%s.%s] is unknown", 
lockType, t.getDbName(),

Review Comment:
   Optional nit: I see this was copied from somewhere else, but there really 
only has to be one argument here, t.getCompleteName()





Issue Time Tracking
---

Worklog Id: (was: 839765)
Time Spent: 1h 20m  (was: 1h 10m)

> Deadlock when rebuilding Materialized view stored by Iceberg
> 
>
> Key: HIVE-26922
> URL: https://issues.apache.org/jira/browse/HIVE-26922
> Project: Hive
>  Issue Type: Bug
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> {code}
> create table tbl_ice(a int, b string, c int) stored by iceberg stored as orc 
> tblproperties ('format-version'='1');
> insert into tbl_ice values (1, 'one', 50), (2, 'two', 51), (3, 'three', 52), 
> (4, 'four', 53), (5, 'five', 54);
> create materialized view mat1 stored by iceberg stored as orc tblproperties 
> ('format-version'='1') as
> select tbl_ice.b, tbl_ice.c from tbl_ice where tbl_ice.c > 52;
> insert into tbl_ice values (10, 'ten', 60);
> alter materialized view mat1 rebuild;
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26925) MV with iceberg storage format fails when contains 'PARTITIONED ON' clause due to column number/types difference.

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26925?focusedWorklogId=839741&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839741
 ]

ASF GitHub Bot logged work on HIVE-26925:
-

Author: ASF GitHub Bot
Created on: 17/Jan/23 18:31
Start Date: 17/Jan/23 18:31
Worklog Time Spent: 10m 
  Work Description: sonarcloud[bot] commented on PR #3939:
URL: https://github.com/apache/hive/pull/3939#issuecomment-1385854313

   Kudos, SonarCloud Quality Gate passed! 
   [Quality Gate passed](https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=3939)
   0 Bugs, 0 Vulnerabilities, 0 Security Hotspots, 5 Code Smells.
   No Coverage information. No Duplication information.




Issue Time Tracking
---

Worklog Id: (was: 839741)
Time Spent: 1h 40m  (was: 1.5h)

> MV with iceberg storage format fails when contains 'PARTITIONED ON' clause 
> due to column number/types difference.
> -
>
> Key: HIVE-26925
> URL: https://issues.apache.org/jira/browse/HIVE-26925
> Project: Hive
>  Issue Type: Bug
>  Components: Iceberg integration
>Reporter: Dharmik Thakkar
>Assignee: Krisztian Kasa
>Priority: Critical
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> MV with iceberg storage format fails when contains 'PARTITIONED ON' clause 
> due to column number/types difference.
> {code:java}
> !!! annotations iceberg
> >>> use iceberg_test_db_hive;
> No rows affected
> >>> set hive.exec.max.dynamic.partitions=2000;
> >>> set hive.exec.max.dynamic.partitions.pernode=2000;
> >>> drop materialized view if exists mv_agg_gby_col_partitioned;
> >>> create materialized view mv_agg_gby_col_partitioned PARTITIONED ON (t) 
> >>> stored by iceberg stored as orc tblproperties ('format-version'='1') as 
> >>> select b,f,sum(b), sum(f),t from all100k group by b,f,v,c,t;
> >>> analyze table mv_agg_gby_col_partitioned compute statistics for columns;
> >>> set hive.explain.user=false;
> >>> explain select b,f,sum(b) from all100k where t=93 group by c,v,f,b;
> !!! match row_contains
>   

[jira] [Work logged] (HIVE-26928) LlapIoImpl::getParquetFooterBuffersFromCache throws exception when metadata cache is disabled

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26928?focusedWorklogId=839735&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839735
 ]

ASF GitHub Bot logged work on HIVE-26928:
-

Author: ASF GitHub Bot
Created on: 17/Jan/23 18:17
Start Date: 17/Jan/23 18:17
Worklog Time Spent: 10m 
  Work Description: simhadri-g opened a new pull request, #3962:
URL: https://github.com/apache/hive/pull/3962

   …tion when metadata cache is disabled
   
   
   
   ### What changes were proposed in this pull request?
   If the metadata / LLAP cache is disabled (hive.llap.io.memory.mode=none) at 
the time the LLAP I/O layer is initialized on daemon startup, reading 
"iceberg + parquet" tables results in an NPE.
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   No
   
   ### How was this patch tested?
   
   1. Q file run using TestIcebergLlapLocalCliDriver.
   2. Manual test.
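
   A minimal sketch of the guard this PR implies, with hypothetical names (only 
the Preconditions failure at LlapIoImpl.java:467 is visible in the issue's 
stack trace; the real method and field signatures may differ):

{code:java}
import java.io.IOException;

// Hypothetical sketch: degrade to a direct footer read when the LLAP metadata
// cache is disabled (hive.llap.io.memory.mode=none) instead of letting a
// Preconditions.checkNotNull call throw an NPE.
public class FooterLookup {
  /** Stand-in for LLAP's metadata cache; null when caching is turned off. */
  interface MetadataCache {
    byte[] getFooter(String path);
  }

  private final MetadataCache metadataCache; // may be null

  FooterLookup(MetadataCache metadataCache) {
    this.metadataCache = metadataCache;
  }

  byte[] footerFor(String path) throws IOException {
    if (metadataCache == null) {
      return readFooterFromFile(path); // cache disabled: read the file directly
    }
    byte[] cached = metadataCache.getFooter(path);
    return cached != null ? cached : readFooterFromFile(path);
  }

  private byte[] readFooterFromFile(String path) throws IOException {
    return new byte[0]; // placeholder for an actual Parquet footer read
  }
}
{code}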




Issue Time Tracking
---

Worklog Id: (was: 839735)
Remaining Estimate: 0h
Time Spent: 10m

> LlapIoImpl::getParquetFooterBuffersFromCache throws exception when metadata 
> cache is disabled
> -
>
> Key: HIVE-26928
> URL: https://issues.apache.org/jira/browse/HIVE-26928
> Project: Hive
>  Issue Type: Improvement
>  Components: Iceberg integration
>Reporter: Rajesh Balamohan
>Assignee: Simhadri Govindappa
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When the metadata / LLAP cache is disabled ("hive.llap.io.memory.mode=none"), 
> "iceberg + parquet" reads throw the following error.
> It should check for "metadatacache" correctly or fix it in LlapIoImpl.
>  
> {noformat}
> Caused by: java.lang.NullPointerException: Metadata cache must not be null
>     at 
> com.google.common.base.Preconditions.checkNotNull(Preconditions.java:897)
>     at 
> org.apache.hadoop.hive.llap.io.api.impl.LlapIoImpl.getParquetFooterBuffersFromCache(LlapIoImpl.java:467)
>     at 
> org.apache.iceberg.mr.hive.vector.HiveVectorizedReader.parquetRecordReader(HiveVectorizedReader.java:227)
>     at 
> org.apache.iceberg.mr.hive.vector.HiveVectorizedReader.reader(HiveVectorizedReader.java:162)
>     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method)
>     at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>     at 
> org.apache.iceberg.common.DynMethods$UnboundMethod.invokeChecked(DynMethods.java:65)
>     at 
> org.apache.iceberg.common.DynMethods$UnboundMethod.invoke(DynMethods.java:77)
>     at 
> org.apache.iceberg.common.DynMethods$StaticMethod.invoke(DynMethods.java:196)
>     at 
> org.apache.iceberg.mr.mapreduce.IcebergInputFormat$IcebergRecordReader.openVectorized(IcebergInputFormat.java:331)
>     at 
> org.apache.iceberg.mr.mapreduce.IcebergInputFormat$IcebergRecordReader.open(IcebergInputFormat.java:377)
>     at 
> org.apache.iceberg.mr.mapreduce.IcebergInputFormat$IcebergRecordReader.nextTask(IcebergInputFormat.java:270)
>     at 
> org.apache.iceberg.mr.mapreduce.IcebergInputFormat$IcebergRecordReader.initialize(IcebergInputFormat.java:266)
>     at 
> org.apache.iceberg.mr.mapred.AbstractMapredIcebergRecordReader.<init>(AbstractMapredIcebergRecordReader.java:40)
>     at 
> org.apache.iceberg.mr.hive.vector.HiveIcebergVectorizedRecordReader.<init>(HiveIcebergVectorizedRecordReader.java:41)
>  {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-26928) LlapIoImpl::getParquetFooterBuffersFromCache throws exception when metadata cache is disabled

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-26928:
--
Labels: pull-request-available  (was: )

> LlapIoImpl::getParquetFooterBuffersFromCache throws exception when metadata 
> cache is disabled
> -
>
> Key: HIVE-26928
> URL: https://issues.apache.org/jira/browse/HIVE-26928
> Project: Hive
>  Issue Type: Improvement
>  Components: Iceberg integration
>Reporter: Rajesh Balamohan
>Assignee: Simhadri Govindappa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When the metadata / LLAP cache is disabled ("hive.llap.io.memory.mode=none"), 
> "iceberg + parquet" reads throw the following error.
> It should check for "metadatacache" correctly or fix it in LlapIoImpl.
>  
> {noformat}
> Caused by: java.lang.NullPointerException: Metadata cache must not be null
>     at 
> com.google.common.base.Preconditions.checkNotNull(Preconditions.java:897)
>     at 
> org.apache.hadoop.hive.llap.io.api.impl.LlapIoImpl.getParquetFooterBuffersFromCache(LlapIoImpl.java:467)
>     at 
> org.apache.iceberg.mr.hive.vector.HiveVectorizedReader.parquetRecordReader(HiveVectorizedReader.java:227)
>     at 
> org.apache.iceberg.mr.hive.vector.HiveVectorizedReader.reader(HiveVectorizedReader.java:162)
>     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method)
>     at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>     at 
> org.apache.iceberg.common.DynMethods$UnboundMethod.invokeChecked(DynMethods.java:65)
>     at 
> org.apache.iceberg.common.DynMethods$UnboundMethod.invoke(DynMethods.java:77)
>     at 
> org.apache.iceberg.common.DynMethods$StaticMethod.invoke(DynMethods.java:196)
>     at 
> org.apache.iceberg.mr.mapreduce.IcebergInputFormat$IcebergRecordReader.openVectorized(IcebergInputFormat.java:331)
>     at 
> org.apache.iceberg.mr.mapreduce.IcebergInputFormat$IcebergRecordReader.open(IcebergInputFormat.java:377)
>     at 
> org.apache.iceberg.mr.mapreduce.IcebergInputFormat$IcebergRecordReader.nextTask(IcebergInputFormat.java:270)
>     at 
> org.apache.iceberg.mr.mapreduce.IcebergInputFormat$IcebergRecordReader.initialize(IcebergInputFormat.java:266)
>     at 
> org.apache.iceberg.mr.mapred.AbstractMapredIcebergRecordReader.<init>(AbstractMapredIcebergRecordReader.java:40)
>     at 
> org.apache.iceberg.mr.hive.vector.HiveIcebergVectorizedRecordReader.<init>(HiveIcebergVectorizedRecordReader.java:41)
>  {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-22173) Query with multiple lateral views hangs during compilation

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22173?focusedWorklogId=839732&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839732
 ]

ASF GitHub Bot logged work on HIVE-22173:
-

Author: ASF GitHub Bot
Created on: 17/Jan/23 18:11
Start Date: 17/Jan/23 18:11
Worklog Time Spent: 10m 
  Work Description: sonarcloud[bot] commented on PR #3852:
URL: https://github.com/apache/hive/pull/3852#issuecomment-1385831457

   Kudos, SonarCloud Quality Gate passed! 
(https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=3852)
   
   0 Bugs, 0 Vulnerabilities, 0 Security Hotspots, 1 Code Smell, No Coverage 
information, No Duplication information
   
   




Issue Time Tracking
---

Worklog Id: (was: 839732)
Time Spent: 1h 50m  (was: 1h 40m)

> Query with multiple lateral views hangs during compilation
> --
>
> Key: HIVE-22173
> URL: https://issues.apache.org/jira/browse/HIVE-22173
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.1.1, 4.0.0-alpha-1
> Environment: Hive-3.1.1, Java-8
>Reporter: Rajkumar Singh
>Assignee: Stamatis Zampetakis
>Priority: Critical
>  Labels: pull-request-available
> Attachments: op_plan_4_lateral_views.pdf, thread-progress.log
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Steps To Repro:
> {code:java}
> -- create table 
> CREATE EXTERNAL TABLE `jsontable`( 
> `json_string` string) 
> ROW FORMAT SERDE 
> 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' 
> STORED AS INPUTFORMAT 
> 'org.apache.hadoop.mapred.TextInputFormat' 
> OUTPUTFORMAT 
> 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' ;
> -- Run explain of the query
> explain SELECT
> *
> FROM jsontable
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.addr.city'), "\\[|\\]|\"", ""),',')) t1 as c1
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.addr.country'), "\\[|\\]|\"", ""),',')) t2 as c2
> lateral view 
> 

[jira] [Work logged] (HIVE-26809) Upgrade ORC to 1.8.1

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26809?focusedWorklogId=839726&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839726
 ]

ASF GitHub Bot logged work on HIVE-26809:
-

Author: ASF GitHub Bot
Created on: 17/Jan/23 17:46
Start Date: 17/Jan/23 17:46
Worklog Time Spent: 10m 
  Work Description: difin commented on code in PR #3833:
URL: https://github.com/apache/hive/pull/3833#discussion_r1072298943


##
ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/EncodedTreeReaderFactory.java:
##
@@ -224,7 +237,252 @@ private static void skipCompressedIndex(boolean 
isCompressed, PositionProvider i
 index.getNext();
   }
 
-  protected static class StringStreamReader extends StringTreeReader
+  public static class StringDictionaryTreeReaderHive extends TreeReader {

Review Comment:
   Hi @ayushtkn, I agree with you, it is not an ideal approach. Before 
implementing it I did try to adapt Hive, but I could not find a way to adapt 
Hive to the ORC-1060 changes because those changes are inside the internal 
implementation of ORC's StringDictionaryTreeReader class; the API of 
StringDictionaryTreeReader itself remained the same.
   
   I also agree that this approach will backfire in the future if we try to 
upgrade and the changes in ORC depend on the ones we ditched. However, Hive 
already depends heavily on internal ORC APIs by implementing its own column 
readers on top of ORC, and when upgrading to a different ORC version it is 
often necessary to make adaptations in Hive anyway.





Issue Time Tracking
---

Worklog Id: (was: 839726)
Time Spent: 5.5h  (was: 5h 20m)

> Upgrade ORC to 1.8.1
> 
>
> Key: HIVE-26809
> URL: https://issues.apache.org/jira/browse/HIVE-26809
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 4.0.0
>Reporter: Dmitriy Fingerman
>Assignee: Dmitriy Fingerman
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26809) Upgrade ORC to 1.8.1

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26809?focusedWorklogId=839725&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839725
 ]

ASF GitHub Bot logged work on HIVE-26809:
-

Author: ASF GitHub Bot
Created on: 17/Jan/23 17:44
Start Date: 17/Jan/23 17:44
Worklog Time Spent: 10m 
  Work Description: difin commented on code in PR #3833:
URL: https://github.com/apache/hive/pull/3833#discussion_r1072298943


##
ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/EncodedTreeReaderFactory.java:
##
@@ -224,7 +237,252 @@ private static void skipCompressedIndex(boolean 
isCompressed, PositionProvider i
 index.getNext();
   }
 
-  protected static class StringStreamReader extends StringTreeReader
+  public static class StringDictionaryTreeReaderHive extends TreeReader {

Review Comment:
   Hi @ayushtkn, I agree with you, it is not an ideal approach. Before 
implementing it I did try to adapt Hive, but I could not find a way to adapt 
Hive to the ORC-1060 changes because those changes are inside the internal 
implementation of ORC's StringDictionaryTreeReader class.
   
   I also agree that this approach will backfire in the future if we try to 
upgrade and the changes in ORC depend on the ones we ditched. However, Hive 
already depends heavily on internal ORC APIs by implementing its own column 
readers on top of ORC, and when upgrading to a different ORC version it is 
often necessary to make adaptations in Hive anyway.





Issue Time Tracking
---

Worklog Id: (was: 839725)
Time Spent: 5h 20m  (was: 5h 10m)

> Upgrade ORC to 1.8.1
> 
>
> Key: HIVE-26809
> URL: https://issues.apache.org/jira/browse/HIVE-26809
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 4.0.0
>Reporter: Dmitriy Fingerman
>Assignee: Dmitriy Fingerman
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26947) Hive compactor.Worker can respawn connections to HMS at extremely high frequency

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26947?focusedWorklogId=839722&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839722
 ]

ASF GitHub Bot logged work on HIVE-26947:
-

Author: ASF GitHub Bot
Created on: 17/Jan/23 17:29
Start Date: 17/Jan/23 17:29
Worklog Time Spent: 10m 
  Work Description: sonarcloud[bot] commented on PR #3955:
URL: https://github.com/apache/hive/pull/3955#issuecomment-1385777432

   Kudos, SonarCloud Quality Gate passed! 
(https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=3955)
   
   0 Bugs, 0 Vulnerabilities, 0 Security Hotspots, 7 Code Smells, No Coverage 
information, No Duplication information
   
   




Issue Time Tracking
---

Worklog Id: (was: 839722)
Time Spent: 1h 10m  (was: 1h)

> Hive compactor.Worker can respawn connections to HMS at extremely high 
> frequency
> 
>
> Key: HIVE-26947
> URL: https://issues.apache.org/jira/browse/HIVE-26947
> Project: Hive
>  Issue Type: Bug
>Reporter: Akshat Mathur
>Assignee: Akshat Mathur
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> After catching the exception generated by the findNextCompactionAndExecute() 
> task, HS2 appears to immediately rerun the task with no delay or backoff. As 
> a result there are ~3500 connection attempts from HS2 to HMS over just a 
> 5-second period in the HS2 log.
> The compactor.Worker should wait between failed attempts, ideally with an 
> exponential backoff.
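
A minimal sketch of the suggested wait-and-backoff loop; the worker body and 
the bounds are illustrative, not the actual compactor code:

{code:java}
import java.util.concurrent.TimeUnit;

// Illustrative backoff loop: sleep after a failed attempt and double the delay
// up to a cap, instead of respawning HMS connections immediately.
public class BackoffLoop {
  public static void main(String[] args) throws InterruptedException {
    long delayMs = 1_000;           // initial wait after a failure
    final long maxDelayMs = 60_000; // cap so the worker still retries regularly
    while (true) {
      try {
        findNextCompactionAndExecute();
        delayMs = 1_000;            // success: reset the backoff window
      } catch (Exception e) {
        TimeUnit.MILLISECONDS.sleep(delayMs);
        delayMs = Math.min(delayMs * 2, maxDelayMs);
      }
    }
  }

  private static void findNextCompactionAndExecute() {
    // placeholder for the real compaction attempt, which may throw when HMS is down
  }
}
{code}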



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26717) Query based Rebalance compaction on insert-only tables

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26717?focusedWorklogId=839720&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839720
 ]

ASF GitHub Bot logged work on HIVE-26717:
-

Author: ASF GitHub Bot
Created on: 17/Jan/23 17:24
Start Date: 17/Jan/23 17:24
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3935:
URL: https://github.com/apache/hive/pull/3935#discussion_r1072518934


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionTxnHandler.java:
##
@@ -342,15 +342,17 @@ public CompactionInfo 
findNextToCompact(FindNextCompactRequest rqst) throws Meta
   public void markCompacted(CompactionInfo info) throws MetaException {
 try {
   Connection dbConn = null;
-  Statement stmt = null;
+  PreparedStatement pstmt = null;

Review Comment:
   Use try-with-resources since you refactored this method; it's reported in 
findbugs.
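
For reference, the pattern the reviewer is asking for, sketched against plain 
JDBC types; the SQL and the 'r' state value are illustrative stand-ins, not the 
exact metastore statement:

{code:java}
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import javax.sql.DataSource;

// try-with-resources closes the statement and the connection even when
// executeUpdate throws, removing the null checks and finally blocks that
// the manual Statement/PreparedStatement handling needed.
public class MarkCompactedSketch {
  void markCompacted(DataSource ds, long compactionId) throws SQLException {
    String sql = "UPDATE COMPACTION_QUEUE SET CQ_STATE = ? WHERE CQ_ID = ?";
    try (Connection dbConn = ds.getConnection();
         PreparedStatement pstmt = dbConn.prepareStatement(sql)) {
      pstmt.setString(1, "r"); // illustrative state value
      pstmt.setLong(2, compactionId);
      pstmt.executeUpdate();
    } // resources are closed here in reverse order of acquisition
  }
}
{code}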





Issue Time Tracking
---

Worklog Id: (was: 839720)
Time Spent: 2h 20m  (was: 2h 10m)

> Query based Rebalance compaction on insert-only tables
> --
>
> Key: HIVE-26717
> URL: https://issues.apache.org/jira/browse/HIVE-26717
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: László Végh
>Assignee: László Végh
>Priority: Major
>  Labels: ACID, compaction, pull-request-available
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26896) Update load_static_ptn_into_bucketed_table.q.out file

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26896?focusedWorklogId=839714&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839714
 ]

ASF GitHub Bot logged work on HIVE-26896:
-

Author: ASF GitHub Bot
Created on: 17/Jan/23 17:05
Start Date: 17/Jan/23 17:05
Worklog Time Spent: 10m 
  Work Description: zabetak closed pull request #3901: HIVE-26896 : Test 
fixes for lineage3.q and load_static_ptn_into_bucketed_table.q
URL: https://github.com/apache/hive/pull/3901




Issue Time Tracking
---

Worklog Id: (was: 839714)
Time Spent: 1h 50m  (was: 1h 40m)

> Update load_static_ptn_into_bucketed_table.q.out file
> -
>
> Key: HIVE-26896
> URL: https://issues.apache.org/jira/browse/HIVE-26896
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Aman Raj
>Assignee: Aman Raj
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 3.2.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> These tests were fixed in branch-3.1, so this change backports them to branch-3.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HIVE-26896) Update load_static_ptn_into_bucketed_table.q.out file

2023-01-17 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis resolved HIVE-26896.

Fix Version/s: 3.2.0
   Resolution: Fixed

Fixed in 
https://github.com/apache/hive/commit/e5573b0e0d30f8c3042239cf0fda219b25fe075d. 
Thanks for the PR [~amanraj2520]!

> Update load_static_ptn_into_bucketed_table.q.out file
> -
>
> Key: HIVE-26896
> URL: https://issues.apache.org/jira/browse/HIVE-26896
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Aman Raj
>Assignee: Aman Raj
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 3.2.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> These tests were fixed in branch-3.1, so this change backports them to branch-3.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-26896) Update load_static_ptn_into_bucketed_table.q.out file

2023-01-17 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-26896:
---
Summary: Update load_static_ptn_into_bucketed_table.q.out file  (was: 
Backport of Test fixes for lineage3.q and load_static_ptn_into_bucketed_table.q)

> Update load_static_ptn_into_bucketed_table.q.out file
> -
>
> Key: HIVE-26896
> URL: https://issues.apache.org/jira/browse/HIVE-26896
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Aman Raj
>Assignee: Aman Raj
>Priority: Critical
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> These tests were fixed in branch-3.1, so this change backports them to branch-3.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-22173) Query with multiple lateral views hangs during compilation

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22173?focusedWorklogId=839710&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839710
 ]

ASF GitHub Bot logged work on HIVE-22173:
-

Author: ASF GitHub Bot
Created on: 17/Jan/23 16:43
Start Date: 17/Jan/23 16:43
Worklog Time Spent: 10m 
  Work Description: zabetak commented on PR #3852:
URL: https://github.com/apache/hive/pull/3852#issuecomment-1385710968

   @amansinha100 I addressed your comments; could you please have another look? 
Thanks!




Issue Time Tracking
---

Worklog Id: (was: 839710)
Time Spent: 1h 40m  (was: 1.5h)

> Query with multiple lateral views hangs during compilation
> --
>
> Key: HIVE-22173
> URL: https://issues.apache.org/jira/browse/HIVE-22173
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.1.1, 4.0.0-alpha-1
> Environment: Hive-3.1.1, Java-8
>Reporter: Rajkumar Singh
>Assignee: Stamatis Zampetakis
>Priority: Critical
>  Labels: pull-request-available
> Attachments: op_plan_4_lateral_views.pdf, thread-progress.log
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Steps To Repro:
> {code:java}
> -- create table 
> CREATE EXTERNAL TABLE `jsontable`( 
> `json_string` string) 
> ROW FORMAT SERDE 
> 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' 
> STORED AS INPUTFORMAT 
> 'org.apache.hadoop.mapred.TextInputFormat' 
> OUTPUTFORMAT 
> 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' ;
> -- Run explain of the query
> explain SELECT
> *
> FROM jsontable
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.addr.city'), "\\[|\\]|\"", ""),',')) t1 as c1
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.addr.country'), "\\[|\\]|\"", ""),',')) t2 as c2
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.addr'), "\\[|\\]|\"", ""),',')) t3 as c3
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.addr.postalCode'), "\\[|\\]|\"", ""),',')) t4 as c4
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.addr.state'), "\\[|\\]|\"", ""),',')) t5 as c5
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.addr.streetAddressLine'), "\\[|\\]|\"", ""),',')) t6 as c6
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.dummyfield'), "\\[|\\]|\"", ""),',')) t7 as c7
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.dummyfield'), "\\[|\\]|\"", ""),',')) t8 as c8
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.dummyfield.name.suffix'), "\\[|\\]|\"", ""),',')) t9 as c9
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.id.extension'), "\\[|\\]|\"", ""),',')) t10 as c10
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.id'), "\\[|\\]|\"", ""),',')) t11 as c11
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.id.root'), "\\[|\\]|\"", ""),',')) t12 as c12
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.telecom.'), "\\[|\\]|\"", ""),',')) t13 as c13
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.dummyfield1.use'), "\\[|\\]|\"", ""),',')) t14 as c14
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.dummyfield1.value'), "\\[|\\]|\"", ""),',')) t15 as c15
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield1.dummyfield1.code'), "\\[|\\]|\"", ""),',')) t16 as c16
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield1.dummyfield1.value'), "\\[|\\]|\"", ""),',')) t17 as c17
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield2.city'), "\\[|\\]|\"", ""),',')) t18 as c18
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield2.city'), "\\[|\\]|\"", ""),',')) t19 as c19
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield2.country'), "\\[|\\]|\"", ""),',')) t20 as c20
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield2.country'), "\\[|\\]|\"", 

[jira] [Work logged] (HIVE-22173) Query with multiple lateral views hangs during compilation

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22173?focusedWorklogId=839708&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839708
 ]

ASF GitHub Bot logged work on HIVE-22173:
-

Author: ASF GitHub Bot
Created on: 17/Jan/23 16:42
Start Date: 17/Jan/23 16:42
Worklog Time Spent: 10m 
  Work Description: zabetak commented on PR #3852:
URL: https://github.com/apache/hive/pull/3852#issuecomment-1385709912

   > Also, the commit message mentions partition pruning but I didn't see 
changes related to that (I might have missed it).
   
   @amansinha100 The partition pruning optimization also relies on the presence 
of the synthetic `IN (...)` predicates generated by the `SyntheticJoinPredicate` 
transformation, thus it is also affected by the changes here. For more details:
   
https://github.com/apache/hive/blob/ad0ab58d9945b9a4727ab606f566e1d346bbd20b/ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java#L91




Issue Time Tracking
---

Worklog Id: (was: 839708)
Time Spent: 1.5h  (was: 1h 20m)

> Query with multiple lateral views hangs during compilation
> --
>
> Key: HIVE-22173
> URL: https://issues.apache.org/jira/browse/HIVE-22173
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.1.1, 4.0.0-alpha-1
> Environment: Hive-3.1.1, Java-8
>Reporter: Rajkumar Singh
>Assignee: Stamatis Zampetakis
>Priority: Critical
>  Labels: pull-request-available
> Attachments: op_plan_4_lateral_views.pdf, thread-progress.log
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Steps To Repro:
> {code:java}
> -- create table 
> CREATE EXTERNAL TABLE `jsontable`( 
> `json_string` string) 
> ROW FORMAT SERDE 
> 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' 
> STORED AS INPUTFORMAT 
> 'org.apache.hadoop.mapred.TextInputFormat' 
> OUTPUTFORMAT 
> 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' ;
> -- Run explain of the query
> explain SELECT
> *
> FROM jsontable
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.addr.city'), "\\[|\\]|\"", ""),',')) t1 as c1
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.addr.country'), "\\[|\\]|\"", ""),',')) t2 as c2
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.addr'), "\\[|\\]|\"", ""),',')) t3 as c3
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.addr.postalCode'), "\\[|\\]|\"", ""),',')) t4 as c4
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.addr.state'), "\\[|\\]|\"", ""),',')) t5 as c5
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.addr.streetAddressLine'), "\\[|\\]|\"", ""),',')) t6 as c6
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.dummyfield'), "\\[|\\]|\"", ""),',')) t7 as c7
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.dummyfield'), "\\[|\\]|\"", ""),',')) t8 as c8
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.dummyfield.name.suffix'), "\\[|\\]|\"", ""),',')) t9 as c9
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.id.extension'), "\\[|\\]|\"", ""),',')) t10 as c10
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.id'), "\\[|\\]|\"", ""),',')) t11 as c11
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.id.root'), "\\[|\\]|\"", ""),',')) t12 as c12
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.telecom.'), "\\[|\\]|\"", ""),',')) t13 as c13
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.dummyfield1.use'), "\\[|\\]|\"", ""),',')) t14 as c14
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.dummyfield1.value'), "\\[|\\]|\"", ""),',')) t15 as c15
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield1.dummyfield1.code'), "\\[|\\]|\"", ""),',')) t16 as c16
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield1.dummyfield1.value'), "\\[|\\]|\"", ""),',')) t17 as c17
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield2.city'), 

[jira] [Work logged] (HIVE-22173) Query with multiple lateral views hangs during compilation

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22173?focusedWorklogId=839707&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839707
 ]

ASF GitHub Bot logged work on HIVE-22173:
-

Author: ASF GitHub Bot
Created on: 17/Jan/23 16:40
Start Date: 17/Jan/23 16:40
Worklog Time Spent: 10m 
  Work Description: zabetak commented on code in PR #3852:
URL: https://github.com/apache/hive/pull/3852#discussion_r1072444905


##
ql/src/test/results/clientpositive/llap/lvj_mapjoin.q.out:
##
@@ -121,7 +121,6 @@ STAGE PLANS:
 TableScan
   alias: expod1
   filterExpr: aid is not null (type: boolean)
-  probeDecodeDetails: cacheKey:HASH_MAP_MAPJOIN_39_container, 
bigKeyColName:aid, smallTablePos:1, keyRatio:1.0

Review Comment:
   The probe decode optimization relies on the presence of semijoins 
(https://github.com/apache/hive/blob/5f57814ed743a411c8fa7c647c24c98461271fe3/ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java#L1401).
 Semijoins depend on the `SyntheticJoinPredicate` transformation, thus probe 
decode depends transitively on `SyntheticJoinPredicate`.
   
   This PR disables the `SyntheticJoinPredicate` transformation for branches 
with lateral views (present in the `lvj_mapjoin.q` test), thus semijoins are 
not considered, and neither is probe decode.





Issue Time Tracking
---

Worklog Id: (was: 839707)
Time Spent: 1h 20m  (was: 1h 10m)

> Query with multiple lateral views hangs during compilation
> --
>
> Key: HIVE-22173
> URL: https://issues.apache.org/jira/browse/HIVE-22173
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.1.1, 4.0.0-alpha-1
> Environment: Hive-3.1.1, Java-8
>Reporter: Rajkumar Singh
>Assignee: Stamatis Zampetakis
>Priority: Critical
>  Labels: pull-request-available
> Attachments: op_plan_4_lateral_views.pdf, thread-progress.log
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Steps To Repro:
> {code:java}
> -- create table 
> CREATE EXTERNAL TABLE `jsontable`( 
> `json_string` string) 
> ROW FORMAT SERDE 
> 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' 
> STORED AS INPUTFORMAT 
> 'org.apache.hadoop.mapred.TextInputFormat' 
> OUTPUTFORMAT 
> 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' ;
> -- Run explain of the query
> explain SELECT
> *
> FROM jsontable
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.addr.city'), "\\[|\\]|\"", ""),',')) t1 as c1
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.addr.country'), "\\[|\\]|\"", ""),',')) t2 as c2
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.addr'), "\\[|\\]|\"", ""),',')) t3 as c3
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.addr.postalCode'), "\\[|\\]|\"", ""),',')) t4 as c4
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.addr.state'), "\\[|\\]|\"", ""),',')) t5 as c5
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.addr.streetAddressLine'), "\\[|\\]|\"", ""),',')) t6 as c6
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.dummyfield'), "\\[|\\]|\"", ""),',')) t7 as c7
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.dummyfield'), "\\[|\\]|\"", ""),',')) t8 as c8
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.dummyfield.name.suffix'), "\\[|\\]|\"", ""),',')) t9 as c9
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.id.extension'), "\\[|\\]|\"", ""),',')) t10 as c10
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.id'), "\\[|\\]|\"", ""),',')) t11 as c11
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.id.root'), "\\[|\\]|\"", ""),',')) t12 as c12
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.telecom.'), "\\[|\\]|\"", ""),',')) t13 as c13
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.dummyfield1.use'), "\\[|\\]|\"", ""),',')) t14 as c14
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.dummyfield1.value'), "\\[|\\]|\"", ""),',')) t15 as c15
> lateral view 
> 

[jira] [Work logged] (HIVE-22173) Query with multiple lateral views hangs during compilation

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22173?focusedWorklogId=839705&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839705
 ]

ASF GitHub Bot logged work on HIVE-22173:
-

Author: ASF GitHub Bot
Created on: 17/Jan/23 16:39
Start Date: 17/Jan/23 16:39
Worklog Time Spent: 10m 
  Work Description: zabetak commented on code in PR #3852:
URL: https://github.com/apache/hive/pull/3852#discussion_r1072443296


##
common/src/java/org/apache/hadoop/hive/conf/HiveConf.java:
##
@@ -3710,7 +3710,12 @@ public static enum ConfVars {
 HIVE_EXPLAIN_USER("hive.explain.user", true,
 "Whether to show explain result at user level.\n" +
 "When enabled, will log EXPLAIN output for the query at user level. 
Tez only."),
-
+HIVE_EXPLAIN_VISIT_LIMIT("hive.explain.visit.limit", 256, new 
RangeValidator(1, Integer.MAX_VALUE),

Review Comment:
   The limit applies only when doing EXPLAIN, hence the choice of name. Adding 
`node` to the property name is a good idea, so I applied this change 
(https://github.com/apache/hive/pull/3852/commits/5c9933e1a59fe6b83638cfa62f9ce887c711).
 I opted to introduce a limit because it is not possible to address the problem 
at the EXPLAIN level without changing the output format.
   
   There are many places where a graph is traversed in Hive, and applying a 
global limit everywhere would be difficult to enforce. Moreover, it would 
possibly require changes in many places, leading to a change with much bigger 
impact.
   
   If we want to go for a global visit limit then maybe it would be better to 
do it as a separate JIRA/PR.
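   
   As a usage note, once merged the limit should be tunable like any other 
HiveConf property; a small sketch under the assumption that the property keeps 
the name shown in the diff above (the follow-up rename may change it):

{code:java}
import org.apache.hadoop.hive.conf.HiveConf;

// Raise the EXPLAIN visit limit for a session that compiles unusually wide
// operator graphs; 256 is the default from the diff above.
public class ExplainLimitExample {
  public static void main(String[] args) {
    HiveConf conf = new HiveConf();
    conf.setInt("hive.explain.visit.limit", 1024);
    System.out.println(conf.getInt("hive.explain.visit.limit", 256));
  }
}
{code}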





Issue Time Tracking
---

Worklog Id: (was: 839705)
Time Spent: 1h 10m  (was: 1h)

> Query with multiple lateral views hangs during compilation
> --
>
> Key: HIVE-22173
> URL: https://issues.apache.org/jira/browse/HIVE-22173
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.1.1, 4.0.0-alpha-1
> Environment: Hive-3.1.1, Java-8
>Reporter: Rajkumar Singh
>Assignee: Stamatis Zampetakis
>Priority: Critical
>  Labels: pull-request-available
> Attachments: op_plan_4_lateral_views.pdf, thread-progress.log
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Steps To Repro:
> {code:java}
> -- create table 
> CREATE EXTERNAL TABLE `jsontable`( 
> `json_string` string) 
> ROW FORMAT SERDE 
> 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' 
> STORED AS INPUTFORMAT 
> 'org.apache.hadoop.mapred.TextInputFormat' 
> OUTPUTFORMAT 
> 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' ;
> -- Run explain of the query
> explain SELECT
> *
> FROM jsontable
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.addr.city'), "\\[|\\]|\"", ""),',')) t1 as c1
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.addr.country'), "\\[|\\]|\"", ""),',')) t2 as c2
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.addr'), "\\[|\\]|\"", ""),',')) t3 as c3
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.addr.postalCode'), "\\[|\\]|\"", ""),',')) t4 as c4
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.addr.state'), "\\[|\\]|\"", ""),',')) t5 as c5
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.addr.streetAddressLine'), "\\[|\\]|\"", ""),',')) t6 as c6
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.dummyfield'), "\\[|\\]|\"", ""),',')) t7 as c7
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.dummyfield'), "\\[|\\]|\"", ""),',')) t8 as c8
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.dummyfield.name.suffix'), "\\[|\\]|\"", ""),',')) t9 as c9
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.id.extension'), "\\[|\\]|\"", ""),',')) t10 as c10
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.id'), "\\[|\\]|\"", ""),',')) t11 as c11
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.id.root'), "\\[|\\]|\"", ""),',')) t12 as c12
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.telecom.'), "\\[|\\]|\"", ""),',')) t13 as c13
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> 

[jira] [Work logged] (HIVE-22173) Query with multiple lateral views hangs during compilation

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22173?focusedWorklogId=839704&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839704
 ]

ASF GitHub Bot logged work on HIVE-22173:
-

Author: ASF GitHub Bot
Created on: 17/Jan/23 16:37
Start Date: 17/Jan/23 16:37
Worklog Time Spent: 10m 
  Work Description: zabetak commented on code in PR #3852:
URL: https://github.com/apache/hive/pull/3852#discussion_r1072441255


##
ql/src/java/org/apache/hadoop/hive/ql/ppd/SyntheticJoinPredicate.java:
##
@@ -94,8 +95,8 @@ public ParseContext transform(ParseContext pctx) throws 
SemanticException {
 // rule and passes the context along
 SyntheticContext context = new SyntheticContext(pctx);
 SemanticDispatcher disp = new DefaultRuleDispatcher(null, opRules, 
context);
-SemanticGraphWalker ogw = new PreOrderOnceWalker(disp);
-
+PreOrderOnceWalker ogw = new PreOrderOnceWalker(disp);
+ogw.excludeNode(LateralViewForwardOperator.class);

Review Comment:
   Done 
(https://github.com/apache/hive/pull/3852/commits/e3c882083d7449efdc7c86ddfbf0c5e86e8c8d93)
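   
   A generic sketch of the node-exclusion idea; the real 
`PreOrderOnceWalker.excludeNode` added in that commit may differ, this only 
illustrates a pre-order walk that stops descending at excluded operator types:

{code:java}
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Pre-order walk that skips the subtree under any node whose class has been
// excluded, mirroring excludeNode(LateralViewForwardOperator.class).
public class ExcludingWalker {
  static class Node {
    final List<Node> children = new ArrayList<>();
  }

  /** Stand-in for LateralViewForwardOperator. */
  static class LateralViewForward extends Node {}

  private final Set<Class<?>> excluded = new HashSet<>();

  void excludeNode(Class<?> nodeType) {
    excluded.add(nodeType);
  }

  void walk(Node node, List<Node> visited) {
    if (excluded.contains(node.getClass())) {
      return; // do not descend below this operator
    }
    visited.add(node);
    for (Node child : node.children) {
      walk(child, visited);
    }
  }
}
{code}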





Issue Time Tracking
---

Worklog Id: (was: 839704)
Time Spent: 1h  (was: 50m)

> Query with multiple lateral views hangs during compilation
> --
>
> Key: HIVE-22173
> URL: https://issues.apache.org/jira/browse/HIVE-22173
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.1.1, 4.0.0-alpha-1
> Environment: Hive-3.1.1, Java-8
>Reporter: Rajkumar Singh
>Assignee: Stamatis Zampetakis
>Priority: Critical
>  Labels: pull-request-available
> Attachments: op_plan_4_lateral_views.pdf, thread-progress.log
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Steps To Repro:
> {code:java}
> -- create table 
> CREATE EXTERNAL TABLE `jsontable`( 
> `json_string` string) 
> ROW FORMAT SERDE 
> 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' 
> STORED AS INPUTFORMAT 
> 'org.apache.hadoop.mapred.TextInputFormat' 
> OUTPUTFORMAT 
> 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' ;
> -- Run explain of the query
> explain SELECT
> *
> FROM jsontable
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.addr.city'), "\\[|\\]|\"", ""),',')) t1 as c1
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.addr.country'), "\\[|\\]|\"", ""),',')) t2 as c2
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.addr'), "\\[|\\]|\"", ""),',')) t3 as c3
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.addr.postalCode'), "\\[|\\]|\"", ""),',')) t4 as c4
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.addr.state'), "\\[|\\]|\"", ""),',')) t5 as c5
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.addr.streetAddressLine'), "\\[|\\]|\"", ""),',')) t6 as c6
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.dummyfield'), "\\[|\\]|\"", ""),',')) t7 as c7
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.dummyfield'), "\\[|\\]|\"", ""),',')) t8 as c8
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.dummyfield.name.suffix'), "\\[|\\]|\"", ""),',')) t9 as c9
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.id.extension'), "\\[|\\]|\"", ""),',')) t10 as c10
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.id'), "\\[|\\]|\"", ""),',')) t11 as c11
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.id.root'), "\\[|\\]|\"", ""),',')) t12 as c12
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.telecom.'), "\\[|\\]|\"", ""),',')) t13 as c13
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.dummyfield1.use'), "\\[|\\]|\"", ""),',')) t14 as c14
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield.dummyfield1.value'), "\\[|\\]|\"", ""),',')) t15 as c15
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield1.dummyfield1.code'), "\\[|\\]|\"", ""),',')) t16 as c16
> lateral view 
> explode(split(regexp_replace(get_json_object(jsontable.json_string, 
> '$.jsonfield1.dummyfield1.value'), "\\[|\\]|\"", ""),',')) 

[jira] [Work logged] (HIVE-26925) MV with iceberg storage format fails when contains 'PARTITIONED ON' clause due to column number/types difference.

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26925?focusedWorklogId=839703&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839703
 ]

ASF GitHub Bot logged work on HIVE-26925:
-

Author: ASF GitHub Bot
Created on: 17/Jan/23 16:37
Start Date: 17/Jan/23 16:37
Worklog Time Spent: 10m 
  Work Description: kasakrisz commented on code in PR #3939:
URL: https://github.com/apache/hive/pull/3939#discussion_r1072440879


##
iceberg/iceberg-handler/src/test/queries/positive/mv_iceberg_partitioned_orc.q:
##
@@ -0,0 +1,16 @@
+

Issue Time Tracking
---

Worklog Id: (was: 839703)
Time Spent: 1.5h  (was: 1h 20m)

> MV with iceberg storage format fails when contains 'PARTITIONED ON' clause 
> due to column number/types difference.
> -
>
> Key: HIVE-26925
> URL: https://issues.apache.org/jira/browse/HIVE-26925
> Project: Hive
>  Issue Type: Bug
>  Components: Iceberg integration
>Reporter: Dharmik Thakkar
>Assignee: Krisztian Kasa
>Priority: Critical
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> MV with iceberg storage format fails when contains 'PARTITIONED ON' clause 
> due to column number/types difference.
> {code:java}
> !!! annotations iceberg
> >>> use iceberg_test_db_hive;
> No rows affected
> >>> set hive.exec.max.dynamic.partitions=2000;
> >>> set hive.exec.max.dynamic.partitions.pernode=2000;
> >>> drop materialized view if exists mv_agg_gby_col_partitioned;
> >>> create materialized view mv_agg_gby_col_partitioned PARTITIONED ON (t) 
> >>> stored by iceberg stored as orc tblproperties ('format-version'='1') as 
> >>> select b,f,sum(b), sum(f),t from all100k group by b,f,v,c,t;
> >>> analyze table mv_agg_gby_col_partitioned compute statistics for columns;
> >>> set hive.explain.user=false;
> >>> explain select b,f,sum(b) from all100k where t=93 group by c,v,f,b;
> !!! match row_contains
>   alias: iceberg_test_db_hive.mv_agg_gby_col_partitioned
> >>> drop materialized view mv_agg_gby_col_partitioned;
>  {code}
> Error
> {code:java}
> 2023-01-10T20:31:17,514 INFO  [pool-5-thread-1] jdbc.TestDriver: Query: 
> create materialized view mv_agg_gby_col_partitioned PARTITIONED ON (t) stored 
> by iceberg stored as orc tblproperties ('format-version'='1') as select 
> b,f,sum(b), sum(f),t from all100k group by b,f,v,c,t
> 2023-01-10T20:31:18,099 INFO  [Thread-21] jdbc.TestDriver: INFO  : Compiling 
> command(queryId=hive_20230110203117_6c333b6a-1642-40e7-80bc-e78dede47980): 
> create materialized view mv_agg_gby_col_partitioned PARTITIONED ON (t) stored 
> by iceberg stored as orc tblproperties ('format-version'='1') as select 
> b,f,sum(b), sum(f),t from all100k group by b,f,v,c,t
> 2023-01-10T20:31:18,100 INFO  [Thread-21] jdbc.TestDriver: INFO  : No Stats 
> for iceberg_test_db_hive@all100k, Columns: b, c, t, f, v
> 2023-01-10T20:31:18,100 INFO  [Thread-21] jdbc.TestDriver: ERROR : FAILED: 
> SemanticException Line 0:-1 Cannot insert into target table because column 
> number/types are different 'TOK_TMP_FILE': Table insclause-0 has 6 columns, 
> but query has 5 columns.
> 2023-01-10T20:31:18,100 INFO  [Thread-21] jdbc.TestDriver: 
> org.apache.hadoop.hive.ql.parse.SemanticException: Line 0:-1 Cannot insert 
> into target table because column number/types are different 'TOK_TMP_FILE': 
> Table insclause-0 has 6 columns, but query has 5 columns.
> 2023-01-10T20:31:18,100 INFO  [Thread-21] jdbc.TestDriver:     at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genConversionSelectOperator(SemanticAnalyzer.java:8905)
> 2023-01-10T20:31:18,100 INFO  [Thread-21] jdbc.TestDriver:     at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFileSinkPlan(SemanticAnalyzer.java:8114)
> 2023-01-10T20:31:18,100 INFO  [Thread-21] jdbc.TestDriver:     at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:11583)
> 2023-01-10T20:31:18,100 INFO  [Thread-21] jdbc.TestDriver:     at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:11455)
> 2023-01-10T20:31:18,100 INFO  [Thread-21] jdbc.TestDriver:     at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:12424)
> 2023-01-10T20:31:18,100 INFO  [Thread-21] jdbc.TestDriver:     at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:12290)
> 2023-01-10T20:31:18,100 INFO  [Thread-21] jdbc.TestDriver:     at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:13038)
> 2023-01-10T20:31:18,100 INFO  [Thread-21] jdbc.TestDriver:     at 
> 

[jira] [Work logged] (HIVE-26956) Improve find_in_set function

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26956?focusedWorklogId=839691&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839691
 ]

ASF GitHub Bot logged work on HIVE-26956:
-

Author: ASF GitHub Bot
Created on: 17/Jan/23 16:00
Start Date: 17/Jan/23 16:00
Worklog Time Spent: 10m 
  Work Description: sonarcloud[bot] commented on PR #3961:
URL: https://github.com/apache/hive/pull/3961#issuecomment-1385647100

   Kudos, SonarCloud Quality Gate passed! 
(https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=3961)
   
   0 Bugs, 0 Vulnerabilities, 0 Security Hotspots, 0 Code Smells, No Coverage 
information, No Duplication information
   
   




Issue Time Tracking
---

Worklog Id: (was: 839691)
Time Spent: 20m  (was: 10m)

> Improve find_in_set function
> ---
>
> Key: HIVE-26956
> URL: https://issues.apache.org/jira/browse/HIVE-26956
> Project: Hive
>  Issue Type: Improvement
>Reporter: Bingye Chen
>Assignee: Bingye Chen
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Improve find_in_set function
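
The ticket carries no description, but a common improvement to 
`find_in_set(str, strList)` is to scan the comma-separated list in place 
instead of splitting it into an array; a purely hypothetical sketch of that 
idea (not the actual patch):

{code:java}
// find_in_set semantics: return the 1-based index of str within the
// comma-separated strList, 0 if absent, and 0 if str itself contains a comma.
public class FindInSet {
  static int findInSet(String str, String strList) {
    if (str.indexOf(',') >= 0) {
      return 0;
    }
    int index = 1;
    int start = 0;
    for (int i = 0; i <= strList.length(); i++) {
      if (i == strList.length() || strList.charAt(i) == ',') {
        // compare the element [start, i) against str without allocating
        if (i - start == str.length()
            && strList.regionMatches(start, str, 0, str.length())) {
          return index;
        }
        index++;
        start = i + 1;
      }
    }
    return 0;
  }

  public static void main(String[] args) {
    System.out.println(findInSet("b", "a,b,c")); // 2
    System.out.println(findInSet("d", "a,b,c")); // 0
  }
}
{code}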



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26925) MV with iceberg storage format fails when contains 'PARTITIONED ON' clause due to column number/types difference.

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26925?focusedWorklogId=839679&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839679
 ]

ASF GitHub Bot logged work on HIVE-26925:
-

Author: ASF GitHub Bot
Created on: 17/Jan/23 15:16
Start Date: 17/Jan/23 15:16
Worklog Time Spent: 10m 
  Work Description: amansinha100 commented on code in PR #3939:
URL: https://github.com/apache/hive/pull/3939#discussion_r1072338247


##
iceberg/iceberg-handler/src/test/queries/positive/mv_iceberg_partitioned_orc.q:
##
@@ -0,0 +1,16 @@
+

Issue Time Tracking
---

Worklog Id: (was: 839679)
Time Spent: 1h 20m  (was: 1h 10m)

> MV with iceberg storage format fails when contains 'PARTITIONED ON' clause 
> due to column number/types difference.
> -
>
> Key: HIVE-26925
> URL: https://issues.apache.org/jira/browse/HIVE-26925
> Project: Hive
>  Issue Type: Bug
>  Components: Iceberg integration
>Reporter: Dharmik Thakkar
>Assignee: Krisztian Kasa
>Priority: Critical
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> MV with iceberg storage format fails when contains 'PARTITIONED ON' clause 
> due to column number/types difference.
> {code:java}
> !!! annotations iceberg
> >>> use iceberg_test_db_hive;
> No rows affected
> >>> set hive.exec.max.dynamic.partitions=2000;
> >>> set hive.exec.max.dynamic.partitions.pernode=2000;
> >>> drop materialized view if exists mv_agg_gby_col_partitioned;
> >>> create materialized view mv_agg_gby_col_partitioned PARTITIONED ON (t) 
> >>> stored by iceberg stored as orc tblproperties ('format-version'='1') as 
> >>> select b,f,sum(b), sum(f),t from all100k group by b,f,v,c,t;
> >>> analyze table mv_agg_gby_col_partitioned compute statistics for columns;
> >>> set hive.explain.user=false;
> >>> explain select b,f,sum(b) from all100k where t=93 group by c,v,f,b;
> !!! match row_contains
>   alias: iceberg_test_db_hive.mv_agg_gby_col_partitioned
> >>> drop materialized view mv_agg_gby_col_partitioned;
>  {code}
> Error
> {code:java}
> 2023-01-10T20:31:17,514 INFO  [pool-5-thread-1] jdbc.TestDriver: Query: 
> create materialized view mv_agg_gby_col_partitioned PARTITIONED ON (t) stored 
> by iceberg stored as orc tblproperties ('format-version'='1') as select 
> b,f,sum(b), sum(f),t from all100k group by b,f,v,c,t
> 2023-01-10T20:31:18,099 INFO  [Thread-21] jdbc.TestDriver: INFO  : Compiling 
> command(queryId=hive_20230110203117_6c333b6a-1642-40e7-80bc-e78dede47980): 
> create materialized view mv_agg_gby_col_partitioned PARTITIONED ON (t) stored 
> by iceberg stored as orc tblproperties ('format-version'='1') as select 
> b,f,sum(b), sum(f),t from all100k group by b,f,v,c,t
> 2023-01-10T20:31:18,100 INFO  [Thread-21] jdbc.TestDriver: INFO  : No Stats 
> for iceberg_test_db_hive@all100k, Columns: b, c, t, f, v
> 2023-01-10T20:31:18,100 INFO  [Thread-21] jdbc.TestDriver: ERROR : FAILED: 
> SemanticException Line 0:-1 Cannot insert into target table because column 
> number/types are different 'TOK_TMP_FILE': Table insclause-0 has 6 columns, 
> but query has 5 columns.
> 2023-01-10T20:31:18,100 INFO  [Thread-21] jdbc.TestDriver: 
> org.apache.hadoop.hive.ql.parse.SemanticException: Line 0:-1 Cannot insert 
> into target table because column number/types are different 'TOK_TMP_FILE': 
> Table insclause-0 has 6 columns, but query has 5 columns.
> 2023-01-10T20:31:18,100 INFO  [Thread-21] jdbc.TestDriver:     at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genConversionSelectOperator(SemanticAnalyzer.java:8905)
> 2023-01-10T20:31:18,100 INFO  [Thread-21] jdbc.TestDriver:     at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFileSinkPlan(SemanticAnalyzer.java:8114)
> 2023-01-10T20:31:18,100 INFO  [Thread-21] jdbc.TestDriver:     at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:11583)
> 2023-01-10T20:31:18,100 INFO  [Thread-21] jdbc.TestDriver:     at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:11455)
> 2023-01-10T20:31:18,100 INFO  [Thread-21] jdbc.TestDriver:     at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:12424)
> 2023-01-10T20:31:18,100 INFO  [Thread-21] jdbc.TestDriver:     at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:12290)
> 2023-01-10T20:31:18,100 INFO  [Thread-21] jdbc.TestDriver:     at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:13038)
> 2023-01-10T20:31:18,100 INFO  [Thread-21] jdbc.TestDriver:     at 
> 

[jira] [Work logged] (HIVE-26809) Upgrade ORC to 1.8.1

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26809?focusedWorklogId=839669=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839669
 ]

ASF GitHub Bot logged work on HIVE-26809:
-

Author: ASF GitHub Bot
Created on: 17/Jan/23 14:45
Start Date: 17/Jan/23 14:45
Worklog Time Spent: 10m 
  Work Description: difin commented on code in PR #3833:
URL: https://github.com/apache/hive/pull/3833#discussion_r1072298943


##
ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/EncodedTreeReaderFactory.java:
##
@@ -224,7 +237,252 @@ private static void skipCompressedIndex(boolean 
isCompressed, PositionProvider i
 index.getNext();
   }
 
-  protected static class StringStreamReader extends StringTreeReader
+  public static class StringDictionaryTreeReaderHive extends TreeReader {

Review Comment:
   Hi @ayushtkn, I agree with you. It is not an ideal approach. Before 
implementing it I did try to adapt Hive, but I could not find a way to adapt 
Hive to the ORC-1060 changes, because those changes are confined to the 
internal implementation of ORC's StringDictionaryTreeReader class. 
   
   I also agree that this approach will backfire in the future if a later 
upgrade depends on the ORC changes we ditched. However, Hive already depends 
heavily on ORC's internal API by implementing its own column readers on top 
of it, so upgrading to a new ORC version often requires adaptations in Hive 
anyway.
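
For readers following along, a purely illustrative sketch of the pattern 
being debated here: Hive pinning its own copy of an ORC internal reader. The 
type and method names below are assumptions for illustration, not the actual 
ORC or Hive APIs (the real types live in org.apache.orc.impl and 
org.apache.hadoop.hive.ql.io.orc.encoded).

{code:java}
// Stand-in for ORC's internal per-column read API (name is hypothetical).
abstract class TreeReaderSketch {
  abstract void nextBatch(int batchSize);
}

// Hive-side fork: keeps the pre-ORC-1060 dictionary-reading behaviour,
// at the cost of re-adapting this copy on every future ORC upgrade.
class StringDictionaryTreeReaderHiveSketch extends TreeReaderSketch {
  @Override
  void nextBatch(int batchSize) {
    // ...decode dictionary-encoded strings the way the old ORC release did...
  }
}
{code}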





Issue Time Tracking
---

Worklog Id: (was: 839669)
Time Spent: 5h 10m  (was: 5h)

> Upgrade ORC to 1.8.1
> 
>
> Key: HIVE-26809
> URL: https://issues.apache.org/jira/browse/HIVE-26809
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 4.0.0
>Reporter: Dmitriy Fingerman
>Assignee: Dmitriy Fingerman
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26717) Query based Rebalance compaction on insert-only tables

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26717?focusedWorklogId=839663=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839663
 ]

ASF GitHub Bot logged work on HIVE-26717:
-

Author: ASF GitHub Bot
Created on: 17/Jan/23 14:28
Start Date: 17/Jan/23 14:28
Worklog Time Spent: 10m 
  Work Description: sonarcloud[bot] commented on PR #3935:
URL: https://github.com/apache/hive/pull/3935#issuecomment-1385511656

   Kudos, SonarCloud Quality Gate passed! [PR #3935: 1 Bug (E), 
0 Vulnerabilities (A), 0 Security Hotspots (A), 5 Code Smells (A); 
no coverage or duplication information.]
   
   




Issue Time Tracking
---

Worklog Id: (was: 839663)
Time Spent: 2h 10m  (was: 2h)

> Query based Rebalance compaction on insert-only tables
> --
>
> Key: HIVE-26717
> URL: https://issues.apache.org/jira/browse/HIVE-26717
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: László Végh
>Assignee: László Végh
>Priority: Major
>  Labels: ACID, compaction, pull-request-available
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HIVE-26957) Add convertCharset(s, from, to) function

2023-01-17 Thread Bingye Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bingye Chen reassigned HIVE-26957:
--


> Add convertCharset(s, from, to) function
> 
>
> Key: HIVE-26957
> URL: https://issues.apache.org/jira/browse/HIVE-26957
> Project: Hive
>  Issue Type: New Feature
>Reporter: Bingye Chen
>Assignee: Bingye Chen
>Priority: Minor
>
> Add convertCharset(s, from, to) function.
> The function converts the string `s` from the `from` charset to the `to` 
> charset. It is already implemented in ClickHouse.
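
A minimal plain-Java sketch of the described semantics (decode the raw bytes 
with the `from` charset, re-encode with the `to` charset). The class and 
method names are illustrative, not the eventual Hive UDF.

{code:java}
import java.nio.charset.Charset;

public class ConvertCharsetSketch {
  // Decode raw bytes as 'from', then re-encode the characters as 'to'.
  static byte[] convertCharset(byte[] raw, String from, String to) {
    String decoded = new String(raw, Charset.forName(from));
    return decoded.getBytes(Charset.forName(to));
  }

  public static void main(String[] args) {
    byte[] latin1 = "caf\u00e9".getBytes(Charset.forName("ISO-8859-1"));
    byte[] utf8 = convertCharset(latin1, "ISO-8859-1", "UTF-8");
    // Prints "café"; the 'é' is now two bytes instead of one.
    System.out.println(new String(utf8, Charset.forName("UTF-8")));
  }
}
{code}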



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26956) Improv find_in_set function

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26956?focusedWorklogId=839630=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839630
 ]

ASF GitHub Bot logged work on HIVE-26956:
-

Author: ASF GitHub Bot
Created on: 17/Jan/23 13:30
Start Date: 17/Jan/23 13:30
Worklog Time Spent: 10m 
  Work Description: TaoZex opened a new pull request, #3961:
URL: https://github.com/apache/hive/pull/3961

   
   
   ### What changes were proposed in this pull request?
   
   
   Improv find_in_set function
   
   ### Why are the changes needed?
   
   
   Code redundancy
   
   ### Does this PR introduce _any_ user-facing change?
   
   no
   
   ### How was this patch tested?
   
   




Issue Time Tracking
---

Worklog Id: (was: 839630)
Remaining Estimate: 0h
Time Spent: 10m

> Improv find_in_set function
> ---
>
> Key: HIVE-26956
> URL: https://issues.apache.org/jira/browse/HIVE-26956
> Project: Hive
>  Issue Type: Improvement
>Reporter: Bingye Chen
>Assignee: Bingye Chen
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Improv find_in_set function
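
For context, the semantics being optimized, as a plain-Java reference sketch 
(not the UDF code this PR touches): find_in_set returns the 1-based position 
of the string in a comma-separated list, 0 if it is absent or itself contains 
a comma, and NULL on NULL input.

{code:java}
static Integer findInSet(String str, String strList) {
  if (str == null || strList == null) {
    return null;                            // NULL in, NULL out
  }
  if (str.indexOf(',') >= 0) {
    return 0;                               // the needle may not contain a comma
  }
  String[] parts = strList.split(",", -1);  // keep empty trailing fields
  for (int i = 0; i < parts.length; i++) {
    if (parts[i].equals(str)) {
      return i + 1;                         // positions are 1-based
    }
  }
  return 0;                                 // not found
}
// findInSet("ab", "abc,b,ab,c,def") == 3
{code}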



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26933) Cleanup dump directory for eventId which was failed in previous dump cycle

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26933?focusedWorklogId=839631=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839631
 ]

ASF GitHub Bot logged work on HIVE-26933:
-

Author: ASF GitHub Bot
Created on: 17/Jan/23 13:30
Start Date: 17/Jan/23 13:30
Worklog Time Spent: 10m 
  Work Description: sonarcloud[bot] commented on PR #3960:
URL: https://github.com/apache/hive/pull/3960#issuecomment-1385428005

   Kudos, SonarCloud Quality Gate passed! [PR #3960: 0 Bugs (A), 
0 Vulnerabilities (A), 0 Security Hotspots (A), 0 Code Smells (A); 
no coverage or duplication information.]
   
   




Issue Time Tracking
---

Worklog Id: (was: 839631)
Time Spent: 20m  (was: 10m)

> Cleanup dump directory for eventId which was failed in previous dump cycle
> --
>
> Key: HIVE-26933
> URL: https://issues.apache.org/jira/browse/HIVE-26933
> Project: Hive
>  Issue Type: Improvement
>Reporter: Harshal Patel
>Assignee: Harshal Patel
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> # If an incremental dump operation fails while dumping an event id in the 
> staging directory, the dump directory for that event id, along with the file 
> _dumpmetadata, still exists in the dump location (which is recorded in the 
> _events_dump file).
>  # When the user triggers a dump operation for this policy again, it resumes 
> dumping from the failed event id and tries to dump it again; but since that 
> event id directory was already created in the previous cycle, it fails with 
> the exception
> {noformat}
> [Scheduled Query Executor(schedule:repl_policytest7, execution_id:7181)]: 
> FAILED: Execution Error, return code 4 from 
> org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask. 
> org.apache.hadoop.fs.FileAlreadyExistsException: 
> /warehouse/tablespace/staging/policytest7/dGVzdDc=/14bcf976-662b-4237-b5bb-e7d63a1d089f/hive/137961/_dumpmetadata
>  for client 172.27.182.5 already exists
>     at 
> 
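
A sketch of the cleanup being proposed, under assumed names 
(prepareEventDumpDir and eventRoot are illustrative, not the actual 
ReplDumpTask code): remove any partial event-id directory left by the failed 
cycle before re-dumping, so the retry does not hit FileAlreadyExistsException.

{code:java}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

class EventDumpCleanup {
  // Delete the leftover directory for the failed event id, then start clean.
  static void prepareEventDumpDir(Configuration conf, Path eventRoot, long eventId)
      throws IOException {
    Path eventDir = new Path(eventRoot, String.valueOf(eventId));
    FileSystem fs = eventDir.getFileSystem(conf);
    if (fs.exists(eventDir)) {
      fs.delete(eventDir, true);  // recursive: drops _dumpmetadata too
    }
    fs.mkdirs(eventDir);
  }
}
{code}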

[jira] [Updated] (HIVE-26956) Improv find_in_set function

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-26956:
--
Labels: pull-request-available  (was: )

> Improv find_in_set function
> ---
>
> Key: HIVE-26956
> URL: https://issues.apache.org/jira/browse/HIVE-26956
> Project: Hive
>  Issue Type: Improvement
>Reporter: Bingye Chen
>Assignee: Bingye Chen
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Improv find_in_set function



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HIVE-26956) Improv find_in_set function

2023-01-17 Thread Bingye Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bingye Chen reassigned HIVE-26956:
--


> Improv find_in_set function
> ---
>
> Key: HIVE-26956
> URL: https://issues.apache.org/jira/browse/HIVE-26956
> Project: Hive
>  Issue Type: Improvement
>Reporter: Bingye Chen
>Assignee: Bingye Chen
>Priority: Minor
>
> Improv find_in_set function



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26802) Create qtest running QB compaction queries

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26802?focusedWorklogId=839596=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839596
 ]

ASF GitHub Bot logged work on HIVE-26802:
-

Author: ASF GitHub Bot
Created on: 17/Jan/23 12:12
Start Date: 17/Jan/23 12:12
Worklog Time Spent: 10m 
  Work Description: sonarcloud[bot] commented on PR #3882:
URL: https://github.com/apache/hive/pull/3882#issuecomment-1385333728

   Kudos, SonarCloud Quality Gate passed! [PR #3882: 0 Bugs (A), 
0 Vulnerabilities (A), 0 Security Hotspots (A), 4 Code Smells (A); 
no coverage or duplication information.]
   
   




Issue Time Tracking
---

Worklog Id: (was: 839596)
Time Spent: 4h 50m  (was: 4h 40m)

> Create qtest running QB compaction queries
> --
>
> Key: HIVE-26802
> URL: https://issues.apache.org/jira/browse/HIVE-26802
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltán Rátkai
>Assignee: Zoltán Rátkai
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> Create a qtest that runs the queries that query-based compaction runs.
> Not so much to check for correct data but more to check the query plans, to 
> simplify tracing changes in compilation that might affect QB compaction.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26804) Cancel Compactions in initiated state

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26804?focusedWorklogId=839589=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839589
 ]

ASF GitHub Bot logged work on HIVE-26804:
-

Author: ASF GitHub Bot
Created on: 17/Jan/23 11:25
Start Date: 17/Jan/23 11:25
Worklog Time Spent: 10m 
  Work Description: veghlaci05 commented on code in PR #3880:
URL: https://github.com/apache/hive/pull/3880#discussion_r1072059615


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java:
##
@@ -6242,4 +6245,91 @@ public boolean isWrapperFor(Class iface) throws 
SQLException {
 }
   }
 
+  @Override
+  @RetrySemantics.SafeToRetry
+  public AbortCompactResponse abortCompactions(AbortCompactionRequest reqst) 
throws MetaException, NoSuchCompactionException {
+
+AbortCompactResponse response = new AbortCompactResponse(new HashMap<>());
+response.setAbortedcompacts(abortCompactionResponseElements);
+List compactionIdsToAbort = reqst.getCompactionIds();
+if (compactionIdsToAbort.isEmpty()) {
+  LOG.info("Compaction ids are missing in request. No compactions to 
abort");
+  throw new NoSuchCompactionException("Compaction ids missing in request. 
No compactions to abort");
+}
+reqst.getCompactionIds().forEach(x -> {
+  abortCompactionResponseElements.put(x, new 
AbortCompactionResponseElement(x, "Error", "Not Eligible"));
+});
+List eligibleCompactionsToAbort = 
findEligibleCompactionsToAbort(compactionIdsToAbort);
+for (int x = 0; x < eligibleCompactionsToAbort.size(); x++) {
+  abortCompaction(eligibleCompactionsToAbort.get(x));
+}
+return response;
+  }
+
+  private void addAbortCompactionResponse(long id, String message, String 
status) {
+abortCompactionResponseElements.put(id, new 
AbortCompactionResponseElement(id, status, message));
+  }
+
+  @RetrySemantics.SafeToRetry
+  public void abortCompaction(CompactionInfo compactionInfo) throws 
MetaException {
+try {
+  try (Connection dbConn = 
getDbConn(Connection.TRANSACTION_READ_COMMITTED, connPoolMutex);
+   PreparedStatement pStmt = 
dbConn.prepareStatement(TxnQueries.INSERT_INTO_COMPLETED_COMPACTION)) {
+CompactionInfo.insertIntoCompletedCompactions(pStmt, compactionInfo, 
getDbTime(dbConn));
+int updCount = pStmt.executeUpdate();
+if (updCount != 1) {
+  LOG.error("Unable to update compaction record: {}. updCnt={}", 
compactionInfo, updCount);
+  dbConn.rollback();

Review Comment:
   addAbortCompactionResponse() should be called here as well, stating that the 
compaction request could not be identified.



##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java:
##
@@ -6242,4 +6245,91 @@ public boolean isWrapperFor(Class iface) throws 
SQLException {
 }
   }
 
+  @Override
+  @RetrySemantics.SafeToRetry
+  public AbortCompactResponse abortCompactions(AbortCompactionRequest reqst) 
throws MetaException, NoSuchCompactionException {
+
+AbortCompactResponse response = new AbortCompactResponse(new HashMap<>());
+response.setAbortedcompacts(abortCompactionResponseElements);
+List compactionIdsToAbort = reqst.getCompactionIds();
+if (compactionIdsToAbort.isEmpty()) {
+  LOG.info("Compaction ids are missing in request. No compactions to 
abort");
+  throw new NoSuchCompactionException("Compaction ids missing in request. 
No compactions to abort");
+}
+reqst.getCompactionIds().forEach(x -> {
+  abortCompactionResponseElements.put(x, new 
AbortCompactionResponseElement(x, "Error", "Not Eligible"));
+});
+List eligibleCompactionsToAbort = 
findEligibleCompactionsToAbort(compactionIdsToAbort);
+for (int x = 0; x < eligibleCompactionsToAbort.size(); x++) {
+  abortCompaction(eligibleCompactionsToAbort.get(x));
+}
+return response;
+  }
+
+  private void addAbortCompactionResponse(long id, String message, String 
status) {
+abortCompactionResponseElements.put(id, new 
AbortCompactionResponseElement(id, status, message));
+  }
+
+  @RetrySemantics.SafeToRetry
+  public void abortCompaction(CompactionInfo compactionInfo) throws 
MetaException {
+try {
+  try (Connection dbConn = 
getDbConn(Connection.TRANSACTION_READ_COMMITTED, connPoolMutex);
+   PreparedStatement pStmt = 
dbConn.prepareStatement(TxnQueries.INSERT_INTO_COMPLETED_COMPACTION)) {
+CompactionInfo.insertIntoCompletedCompactions(pStmt, compactionInfo, 
getDbTime(dbConn));
+int updCount = pStmt.executeUpdate();
+if (updCount != 1) {
+  LOG.error("Unable to update compaction record: {}. updCnt={}", 
compactionInfo, updCount);
+  dbConn.rollback();
+} else {
+  LOG.debug("Inserted {} entries into 

[jira] [Work logged] (HIVE-26952) set the value of metastore.storage.schema.reader.impl
 to org.apache.hadoop.hive.metastore.SerDeStorageSchemaReader as default

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26952?focusedWorklogId=839581=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839581
 ]

ASF GitHub Bot logged work on HIVE-26952:
-

Author: ASF GitHub Bot
Created on: 17/Jan/23 11:07
Start Date: 17/Jan/23 11:07
Worklog Time Spent: 10m 
  Work Description: sonarcloud[bot] commented on PR #3959:
URL: https://github.com/apache/hive/pull/3959#issuecomment-1385259486

   Kudos, SonarCloud Quality Gate passed! [PR #3959: 0 Bugs (A), 
0 Vulnerabilities (A), 0 Security Hotspots (A), 0 Code Smells (A); 
no coverage or duplication information.]
   
   




Issue Time Tracking
---

Worklog Id: (was: 839581)
Time Spent: 20m  (was: 10m)

> set the value of metastore.storage.schema.reader.impl to
> org.apache.hadoop.hive.metastore.SerDeStorageSchemaReader as default
> --
>
> Key: HIVE-26952
> URL: https://issues.apache.org/jira/browse/HIVE-26952
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Reporter: Taraka Rama Rao Lethavadla
>Assignee: Taraka Rama Rao Lethavadla
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> With the default value of
> {code:java}
> DefaultStorageSchemaReader.class.getName(){code}
> for the Metastore config *metastore.storage.schema.reader.impl*,
> the exception below is thrown when trying to read an Avro schema:
> {noformat}
> Caused by: org.apache.hive.service.cli.HiveSQLException: MetaException 
> (message:java.lang.UnsupportedOperationException: Storage schema reading not 
> supported)
>     at 
> org.apache.hive.service.cli.operation.GetColumnsOperation.runInternal(GetColumnsOperation.java:213)
>     at org.apache.hive.service.cli.operation.Operation.run(Operation.java:247)
>     at 
> org.apache.hive.service.cli.session.HiveSessionImpl.getColumns(HiveSessionImpl.java:729)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at 
> 
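
Until the default changes, the workaround is to set the property explicitly. 
A sketch using the plain Hadoop Configuration API (the class name here is 
illustrative; the same key/value pair can go into metastore-site.xml instead):

{code:java}
import org.apache.hadoop.conf.Configuration;

public class SchemaReaderWorkaround {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Use the SerDe-based reader so Avro schemas can be read.
    conf.set("metastore.storage.schema.reader.impl",
        "org.apache.hadoop.hive.metastore.SerDeStorageSchemaReader");
    System.out.println(conf.get("metastore.storage.schema.reader.impl"));
  }
}
{code}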

[jira] [Work logged] (HIVE-26804) Cancel Compactions in initiated state

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26804?focusedWorklogId=839570=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839570
 ]

ASF GitHub Bot logged work on HIVE-26804:
-

Author: ASF GitHub Bot
Created on: 17/Jan/23 10:19
Start Date: 17/Jan/23 10:19
Worklog Time Spent: 10m 
  Work Description: rkirtir commented on code in PR #3880:
URL: https://github.com/apache/hive/pull/3880#discussion_r1072024074


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java:
##
@@ -6242,4 +6246,86 @@ public boolean isWrapperFor(Class iface) throws 
SQLException {
 }
   }
 
+  @Override
+  @RetrySemantics.SafeToRetry
+  public AbortCompactResponse abortCompactions(AbortCompactionRequest reqst) 
throws MetaException, NoSuchCompactionException {

Review Comment:
   As the other compaction-related methods are in TxnHandler, I put it in 
TxnHandler as well. Please suggest an alternative if needed.



##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java:
##
@@ -6242,4 +6246,86 @@ public boolean isWrapperFor(Class iface) throws 
SQLException {
 }
   }
 
+  @Override
+  @RetrySemantics.SafeToRetry
+  public AbortCompactResponse abortCompactions(AbortCompactionRequest reqst) 
throws MetaException, NoSuchCompactionException {
+AbortCompactResponse response = new AbortCompactResponse(new 
ArrayList<>());
+List requestedCompId = reqst.getCompactionIds();
+if (requestedCompId.isEmpty()) {
+  LOG.info("Compaction ids missing in request. No compactions to abort");
+  throw new NoSuchCompactionException("ompaction ids missing in request. 
No compactions to abort");
+}
+List abortCompactionResponseElementList = 
new ArrayList<>();
+for (int i = 0; i < requestedCompId.size(); i++) {
+  AbortCompactionResponseElement responseEle = 
abortCompaction(requestedCompId.get(i));
+  abortCompactionResponseElementList.add(responseEle);
+}
+response.setAbortedcompacts(abortCompactionResponseElementList);
+return response;
+  }
+
+  @RetrySemantics.SafeToRetry
+  public AbortCompactionResponseElement abortCompaction(Long compId) throws 
MetaException {
+try {
+  AbortCompactionResponseElement responseEle = new 
AbortCompactionResponseElement();
+  responseEle.setCompactionIds(compId);
+  try (Connection dbConn = 
getDbConn(Connection.TRANSACTION_READ_COMMITTED, connPoolMutex)) {
+Optional compactionInfo = 
getCompactionByCompId(dbConn, compId);
+if (compactionInfo.isPresent()) {
+  try (PreparedStatement pStmt = 
dbConn.prepareStatement(TxnQueries.INSERT_INTO_COMPLETED_COMPACTION)) {
+CompactionInfo ci = compactionInfo.get();
+ci.errorMessage = "Compaction aborted by user";
+ci.state = TxnStore.ABORTED_STATE;
+CompactionInfo.insertIntoCompletedCompactions(pStmt, ci, 
getDbTime(dbConn));
+int updCount = pStmt.executeUpdate();
+if (updCount != 1) {
+  LOG.error("Unable to update compaction record: {}. updCnt={}", 
ci, updCount);

Review Comment:
   fixed





Issue Time Tracking
---

Worklog Id: (was: 839570)
Time Spent: 1.5h  (was: 1h 20m)

> Cancel Compactions in initiated state
> -
>
> Key: HIVE-26804
> URL: https://issues.apache.org/jira/browse/HIVE-26804
> Project: Hive
>  Issue Type: New Feature
>  Components: Hive
>Reporter: KIRTI RUGE
>Assignee: KIRTI RUGE
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26804) Cancel Compactions in initiated state

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26804?focusedWorklogId=839568=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839568
 ]

ASF GitHub Bot logged work on HIVE-26804:
-

Author: ASF GitHub Bot
Created on: 17/Jan/23 10:18
Start Date: 17/Jan/23 10:18
Worklog Time Spent: 10m 
  Work Description: rkirtir commented on code in PR #3880:
URL: https://github.com/apache/hive/pull/3880#discussion_r1072022879


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java:
##
@@ -6242,4 +6246,86 @@ public boolean isWrapperFor(Class iface) throws 
SQLException {
 }
   }
 
+  @Override
+  @RetrySemantics.SafeToRetry
+  public AbortCompactResponse abortCompactions(AbortCompactionRequest reqst) 
throws MetaException, NoSuchCompactionException {
+AbortCompactResponse response = new AbortCompactResponse(new 
ArrayList<>());
+List requestedCompId = reqst.getCompactionIds();
+if (requestedCompId.isEmpty()) {
+  LOG.info("Compaction ids missing in request. No compactions to abort");
+  throw new NoSuchCompactionException("ompaction ids missing in request. 
No compactions to abort");
+}
+List abortCompactionResponseElementList = 
new ArrayList<>();
+for (int i = 0; i < requestedCompId.size(); i++) {
+  AbortCompactionResponseElement responseEle = 
abortCompaction(requestedCompId.get(i));
+  abortCompactionResponseElementList.add(responseEle);
+}
+response.setAbortedcompacts(abortCompactionResponseElementList);
+return response;
+  }
+
+  @RetrySemantics.SafeToRetry
+  public AbortCompactionResponseElement abortCompaction(Long compId) throws 
MetaException {
+try {
+  AbortCompactionResponseElement responseEle = new 
AbortCompactionResponseElement();
+  responseEle.setCompactionIds(compId);
+  try (Connection dbConn = 
getDbConn(Connection.TRANSACTION_READ_COMMITTED, connPoolMutex)) {
+Optional compactionInfo = 
getCompactionByCompId(dbConn, compId);
+if (compactionInfo.isPresent()) {
+  try (PreparedStatement pStmt = 
dbConn.prepareStatement(TxnQueries.INSERT_INTO_COMPLETED_COMPACTION)) {
+CompactionInfo ci = compactionInfo.get();
+ci.errorMessage = "Compaction aborted by user";
+ci.state = TxnStore.ABORTED_STATE;
+CompactionInfo.insertIntoCompletedCompactions(pStmt, ci, 
getDbTime(dbConn));
+int updCount = pStmt.executeUpdate();
+if (updCount != 1) {
+  LOG.error("Unable to update compaction record: {}. updCnt={}", 
ci, updCount);
+  dbConn.rollback();
+}
+LOG.debug("Inserted {} entries into COMPLETED_COMPACTIONS", 
updCount);
+try (PreparedStatement stmt = dbConn.prepareStatement("DELETE FROM 
\"COMPACTION_QUEUE\" WHERE \"CQ_ID\" = ?")) {
+  stmt.setLong(1, ci.id);
+  LOG.debug("Going to execute update on COMPACTION_QUEUE <{}>");
+  updCount = stmt.executeUpdate();
+  if (updCount != 1) {
+LOG.error("Unable to update compaction record: {}. updCnt={}", 
ci, updCount);
+dbConn.rollback();
+  } else {
+responseEle.setMessage("Successfully Aborted Compaction ");
+responseEle.setStatus("Success");
+dbConn.commit();
+  }
+}
+  }
+} else {
+  responseEle.setMessage("Compaction element not eligible for 
cancellation");
+  responseEle.setStatus("Error");
+}
+  } catch (SQLException e) {
+LOG.error("Failed to abort compaction request");
+checkRetryable(e, "abortCompaction(" + compId + ")");
+responseEle.setMessage("Error while aborting compaction");
+responseEle.setStatus("Error");
+  }
+  return responseEle;
+} catch (RetryException e) {
+  return abortCompaction(compId);
+}
+
+  }
+
+  private Optional getCompactionByCompId(Connection dbConn, 
Long compId) throws SQLException, MetaException {

Review Comment:
   fixed.





Issue Time Tracking
---

Worklog Id: (was: 839568)
Time Spent: 1h 10m  (was: 1h)

> Cancel Compactions in initiated state
> -
>
> Key: HIVE-26804
> URL: https://issues.apache.org/jira/browse/HIVE-26804
> Project: Hive
>  Issue Type: New Feature
>  Components: Hive
>Reporter: KIRTI RUGE
>Assignee: KIRTI RUGE
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26804) Cancel Compactions in initiated state

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26804?focusedWorklogId=839569=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839569
 ]

ASF GitHub Bot logged work on HIVE-26804:
-

Author: ASF GitHub Bot
Created on: 17/Jan/23 10:18
Start Date: 17/Jan/23 10:18
Worklog Time Spent: 10m 
  Work Description: rkirtir commented on code in PR #3880:
URL: https://github.com/apache/hive/pull/3880#discussion_r1072023166


##
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnQueries.java:
##
@@ -50,4 +50,20 @@ public class TxnQueries {
 "  \"CC_HIGHEST_WRITE_ID\"" +
 "FROM " +
 "  \"COMPLETED_COMPACTIONS\" ) XX ";
+
+
+  public static final String SELECT_COMPACTION_QUEUE_BY_COMPID = "SELECT 
\"CQ_ID\", \"CQ_DATABASE\", \"CQ_TABLE\", \"CQ_PARTITION\", "
++ "\"CQ_STATE\", \"CQ_TYPE\", \"CQ_TBLPROPERTIES\", \"CQ_WORKER_ID\", 
\"CQ_START\", \"CQ_RUN_AS\", "
++ "\"CQ_HIGHEST_WRITE_ID\", \"CQ_META_INFO\", \"CQ_HADOOP_JOB_ID\", 
\"CQ_ERROR_MESSAGE\", "
++ "\"CQ_ENQUEUE_TIME\", \"CQ_WORKER_VERSION\", \"CQ_INITIATOR_ID\", 
\"CQ_INITIATOR_VERSION\", "
++ "\"CQ_RETRY_RETENTION\", \"CQ_NEXT_TXN_ID\", \"CQ_TXN_ID\", 
\"CQ_COMMIT_TIME\", \"CQ_POOL_NAME\" "
++ "FROM \"COMPACTION_QUEUE\" WHERE \"CQ_ID\" = ? AND \"CQ_STATE\" ='i'";
+
+  public static final String INSERT_INTO_COMPLETED_COMPACTION = "INSERT INTO 
\"COMPLETED_COMPACTIONS\" "

Review Comment:
   fixed





Issue Time Tracking
---

Worklog Id: (was: 839569)
Time Spent: 1h 20m  (was: 1h 10m)

> Cancel Compactions in initiated state
> -
>
> Key: HIVE-26804
> URL: https://issues.apache.org/jira/browse/HIVE-26804
> Project: Hive
>  Issue Type: New Feature
>  Components: Hive
>Reporter: KIRTI RUGE
>Assignee: KIRTI RUGE
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26804) Cancel Compactions in initiated state

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26804?focusedWorklogId=839567=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839567
 ]

ASF GitHub Bot logged work on HIVE-26804:
-

Author: ASF GitHub Bot
Created on: 17/Jan/23 10:17
Start Date: 17/Jan/23 10:17
Worklog Time Spent: 10m 
  Work Description: rkirtir commented on code in PR #3880:
URL: https://github.com/apache/hive/pull/3880#discussion_r1072022432


##
standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift:
##
@@ -1393,6 +1393,22 @@ struct ShowCompactResponse {
 1: required list compacts,
 }
 
+struct AbortCompactionRequest {
+1: required list compactionIds,
+2: optional string type,
+3: optional string poolName
+}
+
+struct AbortCompactionResponseElement {
+1: required i64 compactionIds,

Review Comment:
   fixed



##
ql/src/java/org/apache/hadoop/hive/ql/ddl/process/abort/compaction/AbortCompactionsOperation.java:
##
@@ -0,0 +1,81 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql.ddl.process.abort.compaction;
+
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hive.metastore.api.AbortCompactResponse;
+import org.apache.hadoop.hive.metastore.api.AbortCompactionRequest;
+import org.apache.hadoop.hive.metastore.api.AbortCompactionResponseElement;
+import org.apache.hadoop.hive.ql.ddl.DDLOperation;
+import org.apache.hadoop.hive.ql.ddl.DDLOperationContext;
+import org.apache.hadoop.hive.ql.ddl.ShowUtils;
+import org.apache.hadoop.hive.ql.exec.Utilities;
+import org.apache.hadoop.hive.ql.metadata.HiveException;
+
+import java.io.DataOutputStream;
+import java.io.IOException;
+
+
+/**
+ * Operation process of aborting compactions.
+ */
+public class AbortCompactionsOperation extends 
DDLOperation {
+public AbortCompactionsOperation(DDLOperationContext context, 
AbortCompactionsDesc desc) {
+super(context, desc);
+}
+
+@Override
+public int execute() throws HiveException {
+AbortCompactionRequest request = new AbortCompactionRequest();
+request.setCompactionIds(desc.getCompactionIds());
+AbortCompactResponse response = 
context.getDb().abortCompactions(request);
+try (DataOutputStream os = ShowUtils.getOutputStream(new 
Path(desc.getResFile()), context)) {
+writeHeader(os);
+if (response.getAbortedcompacts() != null) {
+for (AbortCompactionResponseElement e : 
response.getAbortedcompacts()) {
+writeRow(os, e);
+}
+}
+} catch (Exception e) {
+LOG.warn("show compactions: ", e);

Review Comment:
   fixed





Issue Time Tracking
---

Worklog Id: (was: 839567)
Time Spent: 1h  (was: 50m)

> Cancel Compactions in initiated state
> -
>
> Key: HIVE-26804
> URL: https://issues.apache.org/jira/browse/HIVE-26804
> Project: Hive
>  Issue Type: New Feature
>  Components: Hive
>Reporter: KIRTI RUGE
>Assignee: KIRTI RUGE
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26904) QueryCompactor failed in commitCompaction if the tmp table dir is already removed

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26904?focusedWorklogId=839565=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839565
 ]

ASF GitHub Bot logged work on HIVE-26904:
-

Author: ASF GitHub Bot
Created on: 17/Jan/23 10:16
Start Date: 17/Jan/23 10:16
Worklog Time Spent: 10m 
  Work Description: stiga-huang commented on code in PR #3910:
URL: https://github.com/apache/hive/pull/3910#discussion_r1072021489


##
ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/QueryCompactor.java:
##
@@ -245,16 +243,32 @@ private static void disableLlapCaching(HiveConf conf) {
  * @throws IOException the directory cannot be deleted
  * @throws HiveException the table is not found
  */
-static void cleanupEmptyDir(HiveConf conf, String tmpTableName) throws 
IOException, HiveException {
+static void cleanupEmptyTableDir(HiveConf conf, String tmpTableName)
+throws IOException, HiveException {
   org.apache.hadoop.hive.ql.metadata.Table tmpTable = 
Hive.get().getTable(tmpTableName);
   if (tmpTable != null) {
-Path path = new Path(tmpTable.getSd().getLocation());
-FileSystem fs = path.getFileSystem(conf);
+cleanupEmptyDir(conf, new Path(tmpTable.getSd().getLocation()));
+  }
+}
+
+/**
+ * Remove the directory if it's empty.
+ * @param conf the Hive configuration
+ * @param path path of the directory
+ * @throws IOException if any IO error occurs
+ */
+static void cleanupEmptyDir(HiveConf conf, Path path) throws IOException {
+  FileSystem fs = path.getFileSystem(conf);
+  try {
 if (!fs.listFiles(path, false).hasNext()) {
   fs.delete(path, true);
 }
+  } catch (FileNotFoundException e) {
+// Ignore the case when the dir was already removed
+LOG.warn("Ignored exception during cleanup {}", path, e);

Review Comment:
   It could be deleted before `listFiles()`. The `FileNotFoundException` is 
thrown from `listFiles()`.
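
In other words, an exists() pre-check would not close the window: the 
directory can disappear between any check and the listing. A sketch of the 
race-tolerant shape of the fix (the helper and class names are illustrative):

{code:java}
import java.io.FileNotFoundException;
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

class RaceTolerantCleanup {
  // Delete the directory only if it is empty; tolerate it vanishing first.
  static void deleteIfEmptyQuietly(Configuration conf, Path path) throws IOException {
    FileSystem fs = path.getFileSystem(conf);
    try {
      if (!fs.listFiles(path, false).hasNext()) {
        fs.delete(path, true);
      }
    } catch (FileNotFoundException e) {
      // Already removed by someone else: nothing left to clean up.
    }
  }
}
{code}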





Issue Time Tracking
---

Worklog Id: (was: 839565)
Time Spent: 40m  (was: 0.5h)

> QueryCompactor failed in commitCompaction if the tmp table dir is already 
> removed 
> --
>
> Key: HIVE-26904
> URL: https://issues.apache.org/jira/browse/HIVE-26904
> Project: Hive
>  Issue Type: Bug
>Reporter: Quanlong Huang
>Assignee: Quanlong Huang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> commitCompaction() of query-based compactions just removes the directories of 
> the tmp tables. It should not fail the compaction if the dirs are already removed.
> We've seen such a failure in Impala's test (IMPALA-11756):
> {noformat}
> 2023-01-02T02:09:26,306  INFO [HiveServer2-Background-Pool: Thread-695] 
> ql.Driver: Executing 
> command(queryId=jenkins_20230102020926_69112755-b783-4214-89e5-1c7111dfe15f): 
> alter table partial_catalog_info_test.insert_only_partitioned partition 
> (part=1) compact 'minor' and wait
> 2023-01-02T02:09:26,306  INFO [HiveServer2-Background-Pool: Thread-695] 
> ql.Driver: Starting task [Stage-0:DDL] in serial mode
> 2023-01-02T02:09:26,317  INFO [HiveServer2-Background-Pool: Thread-695] 
> exec.Task: Compaction enqueued with id 15
> ...
> 2023-01-02T02:12:55,849 ERROR 
> [impala-ec2-centos79-m6i-4xlarge-ondemand-1428.vpc.cloudera.com-48_executor] 
> compactor.Worker: Caught exception while trying to compact 
> id:15,dbname:partial_catalog_info_test,tableName:insert_only_partitioned,partName:part=1,state:^@,type:MINOR,enqueueTime:0,start:0,properties:null,runAs:jenkins,tooManyAborts:false,hasOldAbort:false,highestWriteId:3,errorMessage:null,workerId:
>  null,initiatorId: null,retryRetention0. Marking failed to avoid repeated 
> failures
> java.io.FileNotFoundException: File 
> hdfs://localhost:20500/tmp/hive/jenkins/092b533a-81c8-4b95-88e4-9472cf6f365d/_tmp_space.db/62ec04fb-e2d2-4a99-a454-ae709a3cccfe
>  does not exist.
>         at 
> org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.(DistributedFileSystem.java:1275)
>  ~[hadoop-hdfs-client-3.1.1.7.2.15.4-6.jar:?]
>         at 
> org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.(DistributedFileSystem.java:1249)
>  ~[hadoop-hdfs-client-3.1.1.7.2.15.4-6.jar:?]
>         at 
> org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1194)
>  ~[hadoop-hdfs-client-3.1.1.7.2.15.4-6.jar:?]
>         at 
> org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1190)
>  ~[hadoop-hdfs-client-3.1.1.7.2.15.4-6.jar:?]
>         at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>  

[jira] [Assigned] (HIVE-26954) Upgrade Avro to 1.11.1

2023-01-17 Thread Akshat Mathur (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akshat Mathur reassigned HIVE-26954:



> Upgrade Avro to 1.11.1
> --
>
> Key: HIVE-26954
> URL: https://issues.apache.org/jira/browse/HIVE-26954
> Project: Hive
>  Issue Type: Improvement
>Reporter: Akshat Mathur
>Assignee: Akshat Mathur
>Priority: Major
>
> Upgrade Avro dependencies to 1.11.1



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26793) Create a new configuration to override "no compaction" for tables

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26793?focusedWorklogId=839559=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839559
 ]

ASF GitHub Bot logged work on HIVE-26793:
-

Author: ASF GitHub Bot
Created on: 17/Jan/23 10:03
Start Date: 17/Jan/23 10:03
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3822:
URL: https://github.com/apache/hive/pull/3822#discussion_r1072005883


##
ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java:
##
@@ -613,8 +613,9 @@ private boolean isEligibleForCompaction(CompactionInfo ci,
 return false;
   }
 
-  if (isNoAutoCompactSet(t.getParameters())) {
-LOG.info("Table " + tableName(t) + " marked " + 
hive_metastoreConstants.TABLE_NO_AUTO_COMPACT +
+  Map dbParams = computeIfAbsent(ci.dbname, () -> 
resolveDatabase(ci)).getParameters();

Review Comment:
   
   if (replIsCompactionDisabledForTable(t)) {
       skipTables.add(ci.getFullTableName());
       return false;
   }
   
   It would be better if we could refactor the method and use a cache of 
   skipTables/skipDBs instead of doing the same evaluation (isNoAutoCompactSet) 
   for every Table in a skipped Db / every Partition of a skipped Table.
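
A compact sketch of that caching idea (the class and method names are 
illustrative, not the PR's code): remember databases and tables already ruled 
out, so the parameter check runs once per database or table rather than once 
per partition.

{code:java}
import java.util.HashSet;
import java.util.Set;

class CompactionSkipCache {
  private final Set<String> skipDbs = new HashSet<>();
  private final Set<String> skipTables = new HashSet<>();

  boolean isSkipped(String dbName, String fullTableName) {
    return skipDbs.contains(dbName) || skipTables.contains(fullTableName);
  }

  void skipDb(String dbName) { skipDbs.add(dbName); }  // whole DB opted out
  void skipTable(String fullTableName) { skipTables.add(fullTableName); }
}
{code}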





Issue Time Tracking
---

Worklog Id: (was: 839559)
Time Spent: 3h 50m  (was: 3h 40m)

> Create a new configuration to override "no compaction" for tables
> -
>
> Key: HIVE-26793
> URL: https://issues.apache.org/jira/browse/HIVE-26793
> Project: Hive
>  Issue Type: Improvement
>Reporter: Kokila N
>Assignee: Kokila N
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> Currently a simple user can create a table with the 
> {color:#6a8759}no_auto_compaction=true{color} table property and create an 
> aborted write transaction writing to this table. This way a malicious user 
> can prevent the cleanup of data for the aborted transaction, causing 
> performance degradation.
> It should be possible to override this configuration at the database level: 
> adding {color:#6a8759}no_auto_compaction=false{color} should override the 
> table-level setting, forcing the initiator to schedule compaction for all 
> tables.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26793) Create a new configuration to override "no compaction" for tables

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26793?focusedWorklogId=839558=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839558
 ]

ASF GitHub Bot logged work on HIVE-26793:
-

Author: ASF GitHub Bot
Created on: 17/Jan/23 10:02
Start Date: 17/Jan/23 10:02
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3822:
URL: https://github.com/apache/hive/pull/3822#discussion_r1072005883


##
ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java:
##
@@ -613,8 +613,9 @@ private boolean isEligibleForCompaction(CompactionInfo ci,
 return false;
   }
 
-  if (isNoAutoCompactSet(t.getParameters())) {
-LOG.info("Table " + tableName(t) + " marked " + 
hive_metastoreConstants.TABLE_NO_AUTO_COMPACT +
+  Map dbParams = computeIfAbsent(ci.dbname, () -> 
resolveDatabase(ci)).getParameters();

Review Comment:
   
   if (replIsCompactionDisabledForTable(t)) {
       skipTables.add(ci.getFullTableName());
       return false;
   }
   
   It would be better if we could refactor the method and use a cache of 
   skipTables/skipDBs instead of doing the same evaluation for every Table in 
   a skipped Db / every Partition of a skipped Table.





Issue Time Tracking
---

Worklog Id: (was: 839558)
Time Spent: 3h 40m  (was: 3.5h)

> Create a new configuration to override "no compaction" for tables
> -
>
> Key: HIVE-26793
> URL: https://issues.apache.org/jira/browse/HIVE-26793
> Project: Hive
>  Issue Type: Improvement
>Reporter: Kokila N
>Assignee: Kokila N
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> Currently a simple user can create a table with the 
> {color:#6a8759}no_auto_compaction=true{color} table property and create an 
> aborted write transaction writing to this table. This way a malicious user 
> can prevent the cleanup of data for the aborted transaction, causing 
> performance degradation.
> It should be possible to override this configuration at the database level: 
> adding {color:#6a8759}no_auto_compaction=false{color} should override the 
> table-level setting, forcing the initiator to schedule compaction for all 
> tables.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26904) QueryCompactor failed in commitCompaction if the tmp table dir is already removed

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26904?focusedWorklogId=839551=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839551
 ]

ASF GitHub Bot logged work on HIVE-26904:
-

Author: ASF GitHub Bot
Created on: 17/Jan/23 09:55
Start Date: 17/Jan/23 09:55
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3910:
URL: https://github.com/apache/hive/pull/3910#discussion_r1071993848


##
ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/QueryCompactor.java:
##
@@ -245,16 +243,32 @@ private static void disableLlapCaching(HiveConf conf) {
  * @throws IOException the directory cannot be deleted
  * @throws HiveException the table is not found
  */
-static void cleanupEmptyDir(HiveConf conf, String tmpTableName) throws 
IOException, HiveException {
+static void cleanupEmptyTableDir(HiveConf conf, String tmpTableName)
+throws IOException, HiveException {
   org.apache.hadoop.hive.ql.metadata.Table tmpTable = 
Hive.get().getTable(tmpTableName);
   if (tmpTable != null) {
-Path path = new Path(tmpTable.getSd().getLocation());
-FileSystem fs = path.getFileSystem(conf);
+cleanupEmptyDir(conf, new Path(tmpTable.getSd().getLocation()));
+  }
+}
+
+/**
+ * Remove the directory if it's empty.
+ * @param conf the Hive configuration
+ * @param path path of the directory
+ * @throws IOException if any IO error occurs
+ */
+static void cleanupEmptyDir(HiveConf conf, Path path) throws IOException {
+  FileSystem fs = path.getFileSystem(conf);
+  try {
 if (!fs.listFiles(path, false).hasNext()) {
   fs.delete(path, true);
 }
+  } catch (FileNotFoundException e) {
+// Ignore the case when the dir was already removed
+LOG.warn("Ignored exception during cleanup {}", path, e);

Review Comment:
   tmpDir gets deleted between the listing and the actual delete command?





Issue Time Tracking
---

Worklog Id: (was: 839551)
Time Spent: 0.5h  (was: 20m)

> QueryCompactor failed in commitCompaction if the tmp table dir is already 
> removed 
> --
>
> Key: HIVE-26904
> URL: https://issues.apache.org/jira/browse/HIVE-26904
> Project: Hive
>  Issue Type: Bug
>Reporter: Quanlong Huang
>Assignee: Quanlong Huang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> commitCompaction() of query-based compactions just removes the dirs of tmp 
> tables. It should not fail the compaction if the dirs are already removed.
> We've seen such a failure in Impala's test (IMPALA-11756):
> {noformat}
> 2023-01-02T02:09:26,306  INFO [HiveServer2-Background-Pool: Thread-695] 
> ql.Driver: Executing 
> command(queryId=jenkins_20230102020926_69112755-b783-4214-89e5-1c7111dfe15f): 
> alter table partial_catalog_info_test.insert_only_partitioned partition 
> (part=1) compact 'minor' and wait
> 2023-01-02T02:09:26,306  INFO [HiveServer2-Background-Pool: Thread-695] 
> ql.Driver: Starting task [Stage-0:DDL] in serial mode
> 2023-01-02T02:09:26,317  INFO [HiveServer2-Background-Pool: Thread-695] 
> exec.Task: Compaction enqueued with id 15
> ...
> 2023-01-02T02:12:55,849 ERROR 
> [impala-ec2-centos79-m6i-4xlarge-ondemand-1428.vpc.cloudera.com-48_executor] 
> compactor.Worker: Caught exception while trying to compact 
> id:15,dbname:partial_catalog_info_test,tableName:insert_only_partitioned,partName:part=1,state:^@,type:MINOR,enqueueTime:0,start:0,properties:null,runAs:jenkins,tooManyAborts:false,hasOldAbort:false,highestWriteId:3,errorMessage:null,workerId:
>  null,initiatorId: null,retryRetention0. Marking failed to avoid repeated 
> failures
> java.io.FileNotFoundException: File 
> hdfs://localhost:20500/tmp/hive/jenkins/092b533a-81c8-4b95-88e4-9472cf6f365d/_tmp_space.db/62ec04fb-e2d2-4a99-a454-ae709a3cccfe
>  does not exist.
>         at 
> org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.(DistributedFileSystem.java:1275)
>  ~[hadoop-hdfs-client-3.1.1.7.2.15.4-6.jar:?]
>         at 
> org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.(DistributedFileSystem.java:1249)
>  ~[hadoop-hdfs-client-3.1.1.7.2.15.4-6.jar:?]
>         at 
> org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1194)
>  ~[hadoop-hdfs-client-3.1.1.7.2.15.4-6.jar:?]
>         at 
> org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1190)
>  ~[hadoop-hdfs-client-3.1.1.7.2.15.4-6.jar:?]
>         at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>  ~[hadoop-common-3.1.1.7.2.15.4-6.jar:?]
>         

[jira] [Work logged] (HIVE-22977) Merge delta files instead of running a query in major/minor compaction

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22977?focusedWorklogId=839547&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839547
 ]

ASF GitHub Bot logged work on HIVE-22977:
-

Author: ASF GitHub Bot
Created on: 17/Jan/23 09:43
Start Date: 17/Jan/23 09:43
Worklog Time Spent: 10m 
  Work Description: sonarcloud[bot] commented on PR #3801:
URL: https://github.com/apache/hive/pull/3801#issuecomment-1385108588

   Kudos, SonarCloud Quality Gate passed!
   [Quality Gate passed](https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=3801)
   
   0 Bugs (rating A)
   0 Vulnerabilities (rating A)
   0 Security Hotspots (rating A)
   2 Code Smells (rating A)
   
   No Coverage information
   No Duplication information
   
   




Issue Time Tracking
---

Worklog Id: (was: 839547)
Time Spent: 4.5h  (was: 4h 20m)

> Merge delta files instead of running a query in major/minor compaction
> --
>
> Key: HIVE-22977
> URL: https://issues.apache.org/jira/browse/HIVE-22977
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Pintér
>Assignee: Sourabh Badhya
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22977.01.patch, HIVE-22977.02.patch
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> [Compaction Optimization]
> We should analyse the possibility of moving delta files instead of running a 
> major/minor compaction query.
> Please consider the following use cases (a feasibility probe is sketched 
> after this list):
>  - full acid table, but only insert queries were run. This means that no 
> delete delta directories were created. Is it possible to merge the delta 
> directory contents without running a compaction query?
>  - full acid table, ingesting through the streaming API. If there 
> are no aborted transactions during streaming, is it possible to merge the 
> delta directory contents without running a compaction query?
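
A minimal feasibility probe for the first use case, under stated assumptions: 
ACID delete deltas live in directories named delete_delta_*, and detecting 
aborted writes is out of scope here. Names are illustrative, not Hive's API.

{code:java}
import java.io.IOException;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hedged sketch: if a table location holds no delete_delta_* directories,
// compaction could in principle merge/move delta files instead of running
// a full compaction query.
final class DeltaMergeProbe {
  static boolean hasOnlyInsertDeltas(FileSystem fs, Path tableLocation) throws IOException {
    for (FileStatus stat : fs.listStatus(tableLocation)) {
      if (stat.isDirectory() && stat.getPath().getName().startsWith("delete_delta_")) {
        return false; // delete deltas present: a real compaction is needed
      }
    }
    return true;
  }
}
{code}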



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26943) Fix NPE during Optimised Bootstrap when db is dropped

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26943?focusedWorklogId=839546&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839546
 ]

ASF GitHub Bot logged work on HIVE-26943:
-

Author: ASF GitHub Bot
Created on: 17/Jan/23 09:42
Start Date: 17/Jan/23 09:42
Worklog Time Spent: 10m 
  Work Description: sonarcloud[bot] commented on PR #3953:
URL: https://github.com/apache/hive/pull/3953#issuecomment-1385107300

   Kudos, SonarCloud Quality Gate passed!
   [Quality Gate passed](https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=3953)
   
   0 Bugs (rating A)
   0 Vulnerabilities (rating A)
   0 Security Hotspots (rating A)
   0 Code Smells (rating A)
   
   No Coverage information
   No Duplication information
   
   




Issue Time Tracking
---

Worklog Id: (was: 839546)
Time Spent: 40m  (was: 0.5h)

> Fix NPE during Optimised Bootstrap when db is dropped
> -
>
> Key: HIVE-26943
> URL: https://issues.apache.org/jira/browse/HIVE-26943
> Project: Hive
>  Issue Type: Task
>Reporter: Shreenidhi
>Assignee: Shreenidhi
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Consider the steps:
> 1. Current replication is from A (source) -> B (target)
> 2. Failover is complete,
> so now           A (target) <- B (source)
> 3. Suppose the db at A is dropped before reverse replication.
> 4. Now, when reverse replication triggers optimised bootstrap, it throws an 
> NPE (a hedged sketch of the missing guard follows this list)
>  
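
A minimal sketch of the guard such a fix needs, assuming Hive.getDatabase() 
returns null for a missing database (the actual patch may differ):

{code:java}
import org.apache.hadoop.hive.metastore.api.Database;
import org.apache.hadoop.hive.ql.metadata.Hive;
import org.apache.hadoop.hive.ql.metadata.HiveException;

// Hedged sketch: check that the source db still exists before optimized
// bootstrap dereferences it, avoiding the NPE described above.
final class DroppedDbGuard {
  static boolean exists(Hive hiveDb, String dbName) throws HiveException {
    Database db = hiveDb.getDatabase(dbName); // assumed to return null when dropped
    return db != null;
  }
}
{code}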



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-26953) Exception in alter partitions with oracle db when partitions are more than 1000. Exception: ORA-01795: maximum number of expressions in a list is 1000

2023-01-17 Thread Venugopal Reddy K (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venugopal Reddy K updated HIVE-26953:
-
Attachment: (was: partdata1001-1)

> Exception in alter partitions with oracle db when partitions are more than 
> 1000. Exception: ORA-01795: maximum number of expressions in a list is 1000
> --
>
> Key: HIVE-26953
> URL: https://issues.apache.org/jira/browse/HIVE-26953
> Project: Hive
>  Issue Type: Bug
>Reporter: Venugopal Reddy K
>Priority: Major
> Attachments: partdata1001
>
>
> *[Description]* 
> Alter partitions with an Oracle db throws an exception when the number of 
> partitions is more than 1000. Oracle has a limit on the number of values 
> passed to the IN operator: it cannot exceed 1000.
> *[Steps to reproduce]* 
> Create a stage table, load data that has 1000+ rows into it, create a 
> partitioned table, and load data into it from the stage table. The data 
> file [^partdata1001] is attached below.
>  
> {code:java}
> 0: jdbc:hive2://localhost:1> create database mydb;
> 0: jdbc:hive2://localhost:1> use mydb;
>  
> 0: jdbc:hive2://localhost:1> create table stage(sr int, st string, name 
> string) row format delimited fields terminated by '\t' stored as textfile;
>  
> 0: jdbc:hive2://localhost:1> load data local inpath 'partdata1001' into 
> table stage;
>  
> 0: jdbc:hive2://localhost:1> create table dynpart(num int, name string) 
> partitioned by (category string) row format delimited fields terminated by 
> '\t' stored as textfile;
>  
> 0: jdbc:hive2://localhost:1> insert into dynpart select * from stage;
> {code}
>  
> *Alter partition throws an exception (ORA-01795: maximum number of expressions 
> in a list is 1000) during BasicStatsTask.aggregateStats. This issue occurs with 
> Oracle due to its limit on the number of values in the IN operator.*
> *[Exception Stack]*
>  
> {code:java}
> NestedThrowables:
> java.sql.SQLSyntaxErrorException: ORA-01795: maximum number of expressions in 
> a list is 1000
>     at org.apache.hadoop.hive.ql.metadata.Hive.alterPartitions(Hive.java:1145)
>     at 
> org.apache.hadoop.hive.ql.stats.BasicStatsTask.aggregateStats(BasicStatsTask.java:380)
>     at 
> org.apache.hadoop.hive.ql.stats.BasicStatsTask.process(BasicStatsTask.java:108)
>     at org.apache.hadoop.hive.ql.exec.StatsTask.execute(StatsTask.java:107)
>     at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:214)
>     at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105)
>     at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:354)
>     at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:327)
>     at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:244)
>     at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:105)
>     at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:370)
>     at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:205)
>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:154)
>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:149)
>     at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:185)
>     at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:236)
>     at 
> org.apache.hive.service.cli.operation.SQLOperation.access$500(SQLOperation.java:90)
>     at 
> org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:340)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
>     at 
> org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:360)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: MetaException(message:javax.jdo.JDOException: Exception thrown 
> when executing query : SELECT DISTINCT 
> 'org.apache.hadoop.hive.metastore.model.MPartition' AS 
> DN_TYPE,A0.CREATE_TIME,A0.LAST_ACCESS_TIME,A0.PART_NAME,A0.WRITE_ID,A0.PART_ID
>  FROM PARTITIONS A0 LEFT OUTER JOIN TBLS B0 ON A0.TBL_ID = B0.TBL_ID LEFT 
> OUTER JOIN DBS C0 ON B0.DB_ID = C0.DB_ID WHERE B0.TBL_NAME = ? AND C0."NAME" 
> = ? AND A0.PART_NAME 
> 

[jira] [Updated] (HIVE-26953) Exception in alter partitions with oracle db when partitions are more than 1000. Exception: ORA-01795: maximum number of expressions in a list is 1000

2023-01-17 Thread Venugopal Reddy K (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venugopal Reddy K updated HIVE-26953:
-
 Attachment: partdata1001-1
 partdata1001
Description: 
*[Description]* 

Alter partitions with an Oracle db throws an exception when the number of 
partitions is more than 1000. Oracle has a limit on the number of values 
passed to the IN operator: it cannot exceed 1000.

*[Steps to reproduce]* 

Create a stage table, load data that has 1000+ rows into it, create a 
partitioned table, and load data into it from the stage table. The data 
file [^partdata1001] is attached below.

 
{code:java}
0: jdbc:hive2://localhost:1> create database mydb;
0: jdbc:hive2://localhost:1> use mydb;
 
0: jdbc:hive2://localhost:1> create table stage(sr int, st string, name 
string) row format delimited fields terminated by '\t' stored as textfile;
 
0: jdbc:hive2://localhost:1> load data local inpath 'partdata1001' into 
table stage;
 
0: jdbc:hive2://localhost:1> create table dynpart(num int, name string) 
partitioned by (category string) row format delimited fields terminated by '\t' 
stored as textfile;
 
0: jdbc:hive2://localhost:1> insert into dynpart select * from stage;
{code}
 

*Alter partition throws an exception (ORA-01795: maximum number of expressions 
in a list is 1000) during BasicStatsTask.aggregateStats. This issue occurs with 
Oracle due to its limit on the number of values in the IN operator.*
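
A common mitigation for ORA-01795 is to split the IN-list values into batches 
of at most 1000. A minimal sketch of the idea, with illustrative names (this 
is not the actual Hive fix):

{code:java}
import java.util.ArrayList;
import java.util.List;

// Hedged sketch: partition a value list into chunks of <= 1000 so each
// generated "PART_NAME IN (...)" clause stays under Oracle's limit.
final class InClauseBatcher {
  static final int ORACLE_IN_LIMIT = 1000;

  static <T> List<List<T>> batches(List<T> values) {
    List<List<T>> out = new ArrayList<>();
    for (int i = 0; i < values.size(); i += ORACLE_IN_LIMIT) {
      out.add(values.subList(i, Math.min(i + ORACLE_IN_LIMIT, values.size())));
    }
    return out;
  }
}
{code}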

*[Exception Stack]*

 
{code:java}
NestedThrowables:
java.sql.SQLSyntaxErrorException: ORA-01795: maximum number of expressions in a 
list is 1000
    at org.apache.hadoop.hive.ql.metadata.Hive.alterPartitions(Hive.java:1145)
    at 
org.apache.hadoop.hive.ql.stats.BasicStatsTask.aggregateStats(BasicStatsTask.java:380)
    at 
org.apache.hadoop.hive.ql.stats.BasicStatsTask.process(BasicStatsTask.java:108)
    at org.apache.hadoop.hive.ql.exec.StatsTask.execute(StatsTask.java:107)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:214)
    at 
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105)
    at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:354)
    at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:327)
    at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:244)
    at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:105)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:370)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:205)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:154)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:149)
    at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:185)
    at 
org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:236)
    at 
org.apache.hive.service.cli.operation.SQLOperation.access$500(SQLOperation.java:90)
    at 
org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:340)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
    at 
org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:360)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: MetaException(message:javax.jdo.JDOException: Exception thrown when 
executing query : SELECT DISTINCT 
'org.apache.hadoop.hive.metastore.model.MPartition' AS 
DN_TYPE,A0.CREATE_TIME,A0.LAST_ACCESS_TIME,A0.PART_NAME,A0.WRITE_ID,A0.PART_ID 
FROM PARTITIONS A0 LEFT OUTER JOIN TBLS B0 ON A0.TBL_ID = B0.TBL_ID LEFT OUTER 
JOIN DBS C0 ON B0.DB_ID = C0.DB_ID WHERE B0.TBL_NAME = ? AND C0."NAME" = ? AND 
A0.PART_NAME 

[jira] [Assigned] (HIVE-26950) (CTLT) Create external table like V2 table is not preserving table properties

2023-01-17 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena reassigned HIVE-26950:
---

Assignee: Ayush Saxena

> (CTLT) Create external table like V2 table is not preserving table properties
> -
>
> Key: HIVE-26950
> URL: https://issues.apache.org/jira/browse/HIVE-26950
> Project: Hive
>  Issue Type: Improvement
>  Components: Iceberg integration
>Reporter: Rajesh Balamohan
>Assignee: Ayush Saxena
>Priority: Major
>
> # Create an external Iceberg V2 table, e.g. t1
>  # "create external table t2 like t1" <--- This ends up creating a V1 table: 
> "format-version=2" is not retained, and "'format'='iceberg/parquet'" is also 
> not retained.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-26933) Cleanup dump directory for eventId which was failed in previous dump cycle

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-26933:
--
Labels: pull-request-available  (was: )

> Cleanup dump directory for eventId which was failed in previous dump cycle
> --
>
> Key: HIVE-26933
> URL: https://issues.apache.org/jira/browse/HIVE-26933
> Project: Hive
>  Issue Type: Improvement
>Reporter: Harshal Patel
>Assignee: Harshal Patel
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> # If an incremental dump operation fails while dumping an event id in the 
> staging directory, the dump directory for that event id, along with the 
> _dumpmetadata file, still exists in the dump location; the failed event id is 
> recorded in the _events_dump file.
>  # When the user triggers a dump operation for this policy again, it resumes 
> dumping from the failed event id and tries to dump it again, but since the 
> directory for that event id was already created in the previous cycle, it 
> fails with the exception
> {noformat}
> [Scheduled Query Executor(schedule:repl_policytest7, execution_id:7181)]: 
> FAILED: Execution Error, return code 4 from 
> org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask. 
> org.apache.hadoop.fs.FileAlreadyExistsException: 
> /warehouse/tablespace/staging/policytest7/dGVzdDc=/14bcf976-662b-4237-b5bb-e7d63a1d089f/hive/137961/_dumpmetadata
>  for client 172.27.182.5 already exists
>     at 
> org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.startFile(FSDirWriteFileOp.java:388)
>     at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2576)
>     at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2473)
>     at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:773)
>     at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:490)
>     at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:533)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
>     at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:989)
>     at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:917)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898)
>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2894){noformat}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26933) Cleanup dump directory for eventId which was failed in previous dump cycle

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26933?focusedWorklogId=839538&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839538
 ]

ASF GitHub Bot logged work on HIVE-26933:
-

Author: ASF GitHub Bot
Created on: 17/Jan/23 08:58
Start Date: 17/Jan/23 08:58
Worklog Time Spent: 10m 
  Work Description: harshal-16 opened a new pull request, #3960:
URL: https://github.com/apache/hive/pull/3960

   Problem:
- If an incremental dump operation fails while dumping an event id in 
the staging directory, the dump directory for that event id, along with the 
_dumpmetadata file, still exists in the dump location; the failed event id is 
recorded in the _events_dump file.
- When the user triggers a dump operation for this policy again, it 
resumes dumping from the failed event id and tries to dump it again, but since 
the directory for that event id was already created in the previous cycle, it 
fails with the exception.
   Solution:
- fixed cleanFailedEventDirIfExists to remove the folder for the failed 
event id for the selected database (a hedged sketch follows below)
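   
   A hedged sketch of what cleanFailedEventDirIfExists might do; the method 
name comes from the PR description, but its signature, call site, and 
directory layout are assumptions:
   
{code:java}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hedged sketch: before re-dumping the event that failed last cycle, delete
// any partial directory it left behind so the dump does not hit
// FileAlreadyExistsException.
final class FailedEventCleaner {
  static void cleanFailedEventDirIfExists(Path dumpRoot, long failedEventId, Configuration conf)
      throws IOException {
    Path eventDir = new Path(dumpRoot, String.valueOf(failedEventId));
    FileSystem fs = eventDir.getFileSystem(conf);
    if (fs.exists(eventDir)) {
      fs.delete(eventDir, true); // recursive: removes _dumpmetadata and partial data
    }
  }
}
{code}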




Issue Time Tracking
---

Worklog Id: (was: 839538)
Remaining Estimate: 0h
Time Spent: 10m

> Cleanup dump directory for eventId which was failed in previous dump cycle
> --
>
> Key: HIVE-26933
> URL: https://issues.apache.org/jira/browse/HIVE-26933
> Project: Hive
>  Issue Type: Improvement
>Reporter: Harshal Patel
>Assignee: Harshal Patel
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> # If an incremental dump operation fails while dumping an event id in the 
> staging directory, the dump directory for that event id, along with the 
> _dumpmetadata file, still exists in the dump location; the failed event id is 
> recorded in the _events_dump file.
>  # When the user triggers a dump operation for this policy again, it resumes 
> dumping from the failed event id and tries to dump it again, but since the 
> directory for that event id was already created in the previous cycle, it 
> fails with the exception
> {noformat}
> [Scheduled Query Executor(schedule:repl_policytest7, execution_id:7181)]: 
> FAILED: Execution Error, return code 4 from 
> org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask. 
> org.apache.hadoop.fs.FileAlreadyExistsException: 
> /warehouse/tablespace/staging/policytest7/dGVzdDc=/14bcf976-662b-4237-b5bb-e7d63a1d089f/hive/137961/_dumpmetadata
>  for client 172.27.182.5 already exists
>     at 
> org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.startFile(FSDirWriteFileOp.java:388)
>     at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2576)
>     at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2473)
>     at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:773)
>     at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:490)
>     at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:533)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
>     at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:989)
>     at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:917)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898)
>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2894){noformat}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26932) Correct stage name value in replication_metrics.progress column in replication_metrics table

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26932?focusedWorklogId=839537&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839537
 ]

ASF GitHub Bot logged work on HIVE-26932:
-

Author: ASF GitHub Bot
Created on: 17/Jan/23 08:56
Start Date: 17/Jan/23 08:56
Worklog Time Spent: 10m 
  Work Description: harshal-16 closed pull request #3958: HIVE-26932: 
Cleanup dump directory for eventId which was failed in previous dump cycle
URL: https://github.com/apache/hive/pull/3958




Issue Time Tracking
---

Worklog Id: (was: 839537)
Time Spent: 40m  (was: 0.5h)

> Correct stage name value in replication_metrics.progress column in 
> replication_metrics table
> 
>
> Key: HIVE-26932
> URL: https://issues.apache.org/jira/browse/HIVE-26932
> Project: Hive
>  Issue Type: Improvement
>Reporter: Harshal Patel
>Assignee: Harshal Patel
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
>  To improve diagnostic capability for source-to-backup replication, update 
> the replication_metrics table by adding pre_optimized_bootstrap to the 
> progress column during the first cycle of optimized bootstrap.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26952) set the value of metastore.storage.schema.reader.impl
 to org.apache.hadoop.hive.metastore.SerDeStorageSchemaReader as default

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26952?focusedWorklogId=839535&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839535
 ]

ASF GitHub Bot logged work on HIVE-26952:
-

Author: ASF GitHub Bot
Created on: 17/Jan/23 08:44
Start Date: 17/Jan/23 08:44
Worklog Time Spent: 10m 
  Work Description: tarak271 opened a new pull request, #3959:
URL: https://github.com/apache/hive/pull/3959

   …o org.apache.hadoop.hive.metastore.SerDeStorageSchemaReader as default
   
   
   ### What changes were proposed in this pull request?
   
   set the value of the config metastore.storage.schema.reader.impl to 
org.apache.hadoop.hive.metastore.SerDeStorageSchemaReader as the default. 
   
   ### Why are the changes needed?
   
   Previously it was set to DefaultStorageSchemaReader, whose default behavior 
is to fail with "Storage schema reading not supported". SerDeStorageSchemaReader 
was introduced with an implementation that can read the schema from storage. So 
we propose making it the default value so that users do not have to set this 
config in future releases.
   
   ### Does this PR introduce _any_ user-facing change?
   
   No
   
   ### How was this patch tested?
   
   No functionality change is introduced, so existing test cases should not 
fail with this config value change.
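   
   For reference, a hedged sketch of setting this reader programmatically 
today (before the default change). It assumes ConfVars.STORAGE_SCHEMA_READER_IMPL 
is the MetastoreConf constant for metastore.storage.schema.reader.impl; verify 
the constant name against your Hive version.
   
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hive.metastore.conf.MetastoreConf;

// Hedged sketch: explicitly select SerDeStorageSchemaReader, which the PR
// proposes as the new default. The enum constant name is an assumption.
final class SchemaReaderConfig {
  static Configuration withSerDeReader() {
    Configuration conf = MetastoreConf.newMetastoreConf();
    MetastoreConf.setVar(conf, MetastoreConf.ConfVars.STORAGE_SCHEMA_READER_IMPL,
        "org.apache.hadoop.hive.metastore.SerDeStorageSchemaReader");
    return conf;
  }
}
{code}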




Issue Time Tracking
---

Worklog Id: (was: 839535)
Remaining Estimate: 0h
Time Spent: 10m

> set the value of metastore.storage.schema.reader.impl
 to 
> org.apache.hadoop.hive.metastore.SerDeStorageSchemaReader as default
> --
>
> Key: HIVE-26952
> URL: https://issues.apache.org/jira/browse/HIVE-26952
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Reporter: Taraka Rama Rao Lethavadla
>Assignee: Taraka Rama Rao Lethavadla
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> With the default value of
>  
> {code:java}
> DefaultStorageSchemaReader.class.getName(){code}
>  
> in the Metastore Config, *metastore.storage.schema.reader.impl*
> the exception below is thrown when trying to read an Avro schema
> {noformat}
> Caused by: org.apache.hive.service.cli.HiveSQLException: MetaException 
> (message:java.lang.UnsupportedOperationException: Storage schema reading not 
> supported)
>     at 
> org.apache.hive.service.cli.operation.GetColumnsOperation.runInternal(GetColumnsOperation.java:213)
>     at org.apache.hive.service.cli.operation.Operation.run(Operation.java:247)
>     at 
> org.apache.hive.service.cli.session.HiveSessionImpl.getColumns(HiveSessionImpl.java:729)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at 
> org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
>     at 
> org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
>     at 
> org.apache.hive.service.cli.session.HiveSessionProxy.run(HiveSessionProxy.java:63)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
>     at 
> org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
>     at com.sun.proxy..getColumns(Unknown Source)
>     at 
> org.apache.hive.service.cli.CLIService.getColumns(CLIService.java:390){noformat}
> setting the above config to 
> *org.apache.hadoop.hive.metastore.SerDeStorageSchemaReader* resolves the issue
> Proposing to make this value the default in the code base, so that in upcoming 
> versions we don't have to set it manually



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-26952) set the value of metastore.storage.schema.reader.impl
 to org.apache.hadoop.hive.metastore.SerDeStorageSchemaReader as default

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-26952:
--
Labels: pull-request-available  (was: )

> set the value of metastore.storage.schema.reader.impl
 to 
> org.apache.hadoop.hive.metastore.SerDeStorageSchemaReader as default
> --
>
> Key: HIVE-26952
> URL: https://issues.apache.org/jira/browse/HIVE-26952
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Reporter: Taraka Rama Rao Lethavadla
>Assignee: Taraka Rama Rao Lethavadla
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> With the default value of
>  
> {code:java}
> DefaultStorageSchemaReader.class.getName(){code}
>  
> in the Metastore Config, *metastore.storage.schema.reader.impl*
> the exception below is thrown when trying to read an Avro schema
> {noformat}
> Caused by: org.apache.hive.service.cli.HiveSQLException: MetaException 
> (message:java.lang.UnsupportedOperationException: Storage schema reading not 
> supported)
>     at 
> org.apache.hive.service.cli.operation.GetColumnsOperation.runInternal(GetColumnsOperation.java:213)
>     at org.apache.hive.service.cli.operation.Operation.run(Operation.java:247)
>     at 
> org.apache.hive.service.cli.session.HiveSessionImpl.getColumns(HiveSessionImpl.java:729)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at 
> org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
>     at 
> org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
>     at 
> org.apache.hive.service.cli.session.HiveSessionProxy.run(HiveSessionProxy.java:63)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
>     at 
> org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
>     at com.sun.proxy..getColumns(Unknown Source)
>     at 
> org.apache.hive.service.cli.CLIService.getColumns(CLIService.java:390){noformat}
> setting the above config to 
> *org.apache.hadoop.hive.metastore.SerDeStorageSchemaReader* resolves the issue
> Proposing to make this value the default in the code base, so that in upcoming 
> versions we don't have to set it manually



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26711) The very first REPL Load should make the Target Database read-only

2023-01-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26711?focusedWorklogId=839528&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-839528
 ]

ASF GitHub Bot logged work on HIVE-26711:
-

Author: ASF GitHub Bot
Created on: 17/Jan/23 08:08
Start Date: 17/Jan/23 08:08
Worklog Time Spent: 10m 
  Work Description: shreenidhiSaigaonkar commented on code in PR #3736:
URL: https://github.com/apache/hive/pull/3736#discussion_r1071876874


##########
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplWithReadOnlyHook.java:
##########
@@ -0,0 +1,126 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.parse;
+
+import static 
org.apache.hadoop.hive.ql.hooks.EnforceReadOnlyDatabaseHook.READONLY;
+import static org.apache.hadoop.hive.common.repl.ReplConst.READ_ONLY_HOOK;
+import static org.junit.Assert.assertEquals;
+
+import org.apache.hadoop.hdfs.MiniDFSCluster;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.conf.MetastoreConf;
+import 
org.apache.hadoop.hive.metastore.messaging.json.gzip.GzipJSONMessageEncoder;
+import org.apache.hadoop.hive.shims.Utils;
+import org.junit.After;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+import java.util.HashMap;
+import java.util.Map;
+
+public class TestReplWithReadOnlyHook extends 
BaseReplicationScenariosAcidTables {
+
+  @BeforeClass
+  public static void classLevelSetup() throws Exception {
+Map<String, String> overrides = new HashMap<>();
+overrides.put(MetastoreConf.ConfVars.EVENT_MESSAGE_FACTORY.getHiveName(),
+  GzipJSONMessageEncoder.class.getCanonicalName());
+
+conf = new HiveConf(TestReplWithReadOnlyHook.class);
+conf.set("hadoop.proxyuser." + Utils.getUGI().getShortUserName() + 
".hosts", "*");
+
+MiniDFSCluster miniDFSCluster =
+  new MiniDFSCluster.Builder(conf).numDataNodes(2).format(true).build();
+
+Map<String, String> acidEnableConf = new HashMap<String, String>() {{

Review Comment:
   Done





Issue Time Tracking
---

Worklog Id: (was: 839528)
Time Spent: 1h 40m  (was: 1.5h)

> The very first REPL Load should make the Target Database read-only
> --
>
> Key: HIVE-26711
> URL: https://issues.apache.org/jira/browse/HIVE-26711
> Project: Hive
>  Issue Type: Task
>Reporter: Shreenidhi
>Assignee: Shreenidhi
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Use the EnforceReadOnly hook to set the TARGET database read-only during 
> bootstrap load.
> Also ensure backward compatibility.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)