[jira] [Work logged] (HIVE-27234) Iceberg: CREATE BRANCH SQL implementation

2023-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27234?focusedWorklogId=860466&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-860466
 ]

ASF GitHub Bot logged work on HIVE-27234:
-

Author: ASF GitHub Bot
Created on: 04/May/23 05:56
Start Date: 04/May/23 05:56
Worklog Time Spent: 10m 
  Work Description: zhangbutao closed pull request #4216: HIVE-27234: 
Iceberg: CREATE BRANCH SQL implementation
URL: https://github.com/apache/hive/pull/4216




Issue Time Tracking
---

Worklog Id: (was: 860466)
Time Spent: 6h 50m  (was: 6h 40m)

> Iceberg:  CREATE BRANCH SQL implementation
> --
>
> Key: HIVE-27234
> URL: https://issues.apache.org/jira/browse/HIVE-27234
> Project: Hive
>  Issue Type: Sub-task
>  Components: Iceberg integration
>Reporter: zhangbutao
>Assignee: zhangbutao
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 6h 50m
>  Remaining Estimate: 0h
>
> Maybe we can follow the Spark SQL branch DDL implementation: 
> [https://github.com/apache/iceberg/pull/6617]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27234) Iceberg: CREATE BRANCH SQL implementation

2023-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27234?focusedWorklogId=860467&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-860467
 ]

ASF GitHub Bot logged work on HIVE-27234:
-

Author: ASF GitHub Bot
Created on: 04/May/23 05:56
Start Date: 04/May/23 05:56
Worklog Time Spent: 10m 
  Work Description: zhangbutao opened a new pull request, #4216:
URL: https://github.com/apache/hive/pull/4216

   
   
   ### What changes were proposed in this pull request?
   
   This PR follows the Spark SQL implementation of Iceberg branch DDL: 
https://github.com/apache/iceberg/pull/6617
   If anyone has a different opinion about the branch SQL syntax, we can 
discuss it here.
   
   ### Why are the changes needed?
   
   Personally, I find branches more useful than snapshots in Iceberg, and they 
are friendlier to users. We can use branches to do a lot of meaningful work. 
   
   ### Does this PR introduce _any_ user-facing change?
   
   Added a new SQL syntax; Hive users can create an Iceberg branch using it:
   ```
   ALTER TABLE tableName
   CREATE BRANCH branchName
   [AS OF VERSION snapshotId]
   [RETAIN interval {DAYS | HOURS | MINUTES}]
   [WITH SNAPSHOT RETENTION {[num_snapshots SNAPSHOTS] [interval {DAYS | HOURS | MINUTES}]}]
   ```
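   For illustration, two hypothetical invocations of the new syntax (the table, 
branch, and snapshot-id values below are made up, not taken from this patch):
   ```
   -- branch from the table's current snapshot, kept for 7 days
   ALTER TABLE db.tbl CREATE BRANCH branch1 RETAIN 7 DAYS;

   -- branch pinned to a specific snapshot id, keeping at most 5 snapshots on it
   ALTER TABLE db.tbl CREATE BRANCH branch2 AS OF VERSION 1234567890
   WITH SNAPSHOT RETENTION 5 SNAPSHOTS;
   ```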
   
   ### How was this patch tested?
   
   UT




Issue Time Tracking
---

Worklog Id: (was: 860467)
Time Spent: 7h  (was: 6h 50m)

> Iceberg:  CREATE BRANCH SQL implementation
> --
>
> Key: HIVE-27234
> URL: https://issues.apache.org/jira/browse/HIVE-27234
> Project: Hive
>  Issue Type: Sub-task
>  Components: Iceberg integration
>Reporter: zhangbutao
>Assignee: zhangbutao
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 7h
>  Remaining Estimate: 0h
>
> Maybe we can follow the Spark SQL branch DDL implementation: 
> [https://github.com/apache/iceberg/pull/6617]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27234) Iceberg: CREATE BRANCH SQL implementation

2023-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27234?focusedWorklogId=860463&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-860463
 ]

ASF GitHub Bot logged work on HIVE-27234:
-

Author: ASF GitHub Bot
Created on: 04/May/23 04:50
Start Date: 04/May/23 04:50
Worklog Time Spent: 10m 
  Work Description: zhangbutao commented on code in PR #4216:
URL: https://github.com/apache/hive/pull/4216#discussion_r1184537248


##########
iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergStorageHandler.java:
##########
@@ -676,6 +678,35 @@ public void executeOperation(org.apache.hadoop.hive.ql.metadata.Table hmsTable,
     }
   }
 
+  @Override
+  public void createBranchOperation(org.apache.hadoop.hive.ql.metadata.Table hmsTable,
+      AlterTableCreateBranchSpec createBranchSpec) {
+    TableDesc tableDesc = Utilities.getTableDesc(hmsTable);
+    Table icebergTable = IcebergTableUtil.getTable(conf, tableDesc.getProperties());
+
+    String branchName = createBranchSpec.getBranchName();
+    Optional.ofNullable(icebergTable.currentSnapshot()).orElseThrow(() -> new UnsupportedOperationException(
+        String.format("Cannot create branch %s on iceberg table %s.%s which has no snapshot",
+            branchName, hmsTable.getDbName(), hmsTable.getTableName())));
+    Long snapshotId = Optional.ofNullable(createBranchSpec.getSnapshotId())
+        .orElse(icebergTable.currentSnapshot().snapshotId());
+    LOG.info("Creating branch {} on iceberg table {}.{}", branchName, hmsTable.getDbName(),

Review Comment:
   Added `snapshotId`. I think INFO is enough, as this DDL is rarely executed.





Issue Time Tracking
---

Worklog Id: (was: 860463)
Time Spent: 6h 40m  (was: 6.5h)

> Iceberg:  CREATE BRANCH SQL implementation
> --
>
> Key: HIVE-27234
> URL: https://issues.apache.org/jira/browse/HIVE-27234
> Project: Hive
>  Issue Type: Sub-task
>  Components: Iceberg integration
>Reporter: zhangbutao
>Assignee: zhangbutao
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 6h 40m
>  Remaining Estimate: 0h
>
> Maybe we can follow the Spark SQL branch DDL implementation: 
> [https://github.com/apache/iceberg/pull/6617]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27234) Iceberg: CREATE BRANCH SQL implementation

2023-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27234?focusedWorklogId=860462&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-860462
 ]

ASF GitHub Bot logged work on HIVE-27234:
-

Author: ASF GitHub Bot
Created on: 04/May/23 04:47
Start Date: 04/May/23 04:47
Worklog Time Spent: 10m 
  Work Description: zhangbutao commented on code in PR #4216:
URL: https://github.com/apache/hive/pull/4216#discussion_r1184536300


##########
ql/src/java/org/apache/hadoop/hive/ql/ddl/table/AlterTableType.java:
##########
@@ -40,6 +40,7 @@ public enum AlterTableType {
   ALTERPARTITION("alter partition"), // Note: this is never used in AlterTableDesc.
   SETPARTITIONSPEC("set partition spec"),
   EXECUTE("execute"),
+  CREATEBRANCH("create branch"),

Review Comment:
   done





Issue Time Tracking
---

Worklog Id: (was: 860462)
Time Spent: 6.5h  (was: 6h 20m)

> Iceberg:  CREATE BRANCH SQL implementation
> --
>
> Key: HIVE-27234
> URL: https://issues.apache.org/jira/browse/HIVE-27234
> Project: Hive
>  Issue Type: Sub-task
>  Components: Iceberg integration
>Reporter: zhangbutao
>Assignee: zhangbutao
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 6.5h
>  Remaining Estimate: 0h
>
> Maybe we can follow the Spark SQL branch DDL implementation: 
> [https://github.com/apache/iceberg/pull/6617]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27234) Iceberg: CREATE BRANCH SQL implementation

2023-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27234?focusedWorklogId=860461&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-860461
 ]

ASF GitHub Bot logged work on HIVE-27234:
-

Author: ASF GitHub Bot
Created on: 04/May/23 04:46
Start Date: 04/May/23 04:46
Worklog Time Spent: 10m 
  Work Description: zhangbutao commented on code in PR #4216:
URL: https://github.com/apache/hive/pull/4216#discussion_r1184536183


##########
parser/src/java/org/apache/hadoop/hive/ql/parse/AlterClauseParser.g:
##########
@@ -477,6 +478,34 @@ alterStatementSuffixExecute
     -> ^(TOK_ALTERTABLE_EXECUTE KW_SET_CURRENT_SNAPSHOT $snapshotParam)
     ;
 
+alterStatementSuffixCreateBranch
+@init { gParent.pushMsg("alter table create branch", state); }
+@after { gParent.popMsg(state); }
+    : KW_CREATE KW_BRANCH branchName=identifier snapshotIdOfBranch? branchRetain? retentionOfSnapshots?
+    -> ^(TOK_ALTERTABLE_CREATE_BRANCH $branchName snapshotIdOfBranch? branchRetain? retentionOfSnapshots?)
+    ;
+
+snapshotIdOfBranch
+@init { gParent.pushMsg("alter table create branch as of version", state); }

Review Comment:
   done



##########
parser/src/java/org/apache/hadoop/hive/ql/parse/AlterClauseParser.g:
##########
@@ -477,6 +478,34 @@ alterStatementSuffixExecute
     -> ^(TOK_ALTERTABLE_EXECUTE KW_SET_CURRENT_SNAPSHOT $snapshotParam)
     ;
 
+alterStatementSuffixCreateBranch
+@init { gParent.pushMsg("alter table create branch", state); }
+@after { gParent.popMsg(state); }
+    : KW_CREATE KW_BRANCH branchName=identifier snapshotIdOfBranch? branchRetain? retentionOfSnapshots?
+    -> ^(TOK_ALTERTABLE_CREATE_BRANCH $branchName snapshotIdOfBranch? branchRetain? retentionOfSnapshots?)
+    ;
+
+snapshotIdOfBranch
+@init { gParent.pushMsg("alter table create branch as of version", state); }
+@after { gParent.popMsg(state); }
+    : KW_AS KW_OF KW_VERSION snapshotId=Number
+    -> ^(TOK_AS_OF_VERSION_BRANCH $snapshotId)

Review Comment:
   done





Issue Time Tracking
---

Worklog Id: (was: 860461)
Time Spent: 6h 20m  (was: 6h 10m)

> Iceberg:  CREATE BRANCH SQL implementation
> --
>
> Key: HIVE-27234
> URL: https://issues.apache.org/jira/browse/HIVE-27234
> Project: Hive
>  Issue Type: Sub-task
>  Components: Iceberg integration
>Reporter: zhangbutao
>Assignee: zhangbutao
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 6h 20m
>  Remaining Estimate: 0h
>
> Maybe we can follow the Spark SQL branch DDL implementation: 
> [https://github.com/apache/iceberg/pull/6617]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-24515) Analyze table job can be skipped when stats populated are already accurate

2023-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24515?focusedWorklogId=860443&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-860443
 ]

ASF GitHub Bot logged work on HIVE-24515:
-

Author: ASF GitHub Bot
Created on: 03/May/23 22:44
Start Date: 03/May/23 22:44
Worklog Time Spent: 10m 
  Work Description: sonarcloud[bot] commented on PR #4288:
URL: https://github.com/apache/hive/pull/4288#issuecomment-1533845859

   Kudos, SonarCloud Quality Gate passed! [Quality Gate passed](https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=4288)
   
   0 Bugs  
   0 Vulnerabilities  
   0 Security Hotspots  
   14 Code Smells
   
   No Coverage information  
   No Duplication information
   
   




Issue Time Tracking
---

Worklog Id: (was: 860443)
Time Spent: 4h  (was: 3h 50m)

> Analyze table job can be skipped when stats populated are already accurate
> --
>
> Key: HIVE-24515
> URL: https://issues.apache.org/jira/browse/HIVE-24515
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Dmitriy Fingerman
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> For non-partitioned tables, stats detail should be present at the table level,
> e.g.
> {noformat}
> COLUMN_STATS_ACCURATE={"BASIC_STATS":"true","COLUMN_STATS":{"d_current_day":"true"...
>  }}
>   {noformat}
> For partitioned tables, stats detail should be present at the partition level,
> {noformat}
> store_sales(ss_sold_date_sk=2451819)
> {totalSize=0, numRows=0, rawDataSize=0, 
> COLUMN_STATS_ACCURATE={"BASIC_STATS":"true","COLUMN_STATS":{"ss_addr_sk":"true"}}
>  
>  {noformat}
> When the populated stats are already accurate, {{analyze table tn compute 
> statistics for columns}} should skip launching the job.
>  
> For ACID tables, stats are auto-computed, so computing them again can be 
> skipped when the existing stats are accurate.
>  
>  
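As a concrete illustration of the requested behavior (the table name here is 
hypothetical), the second run below could become a no-op once HMS already marks 
the stats accurate:
{noformat}
-- computes column stats and sets COLUMN_STATS_ACCURATE on the table
analyze table store_sales compute statistics for columns;

-- nothing has changed since; with this improvement the job launch could be skipped
analyze table store_sales compute statistics for columns;
{noformat}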



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HIVE-24289) RetryingMetaStoreClient should not retry connecting to HMS on genuine errors

2023-05-03 Thread Dmitriy Fingerman (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy Fingerman resolved HIVE-24289.
--
Resolution: Duplicate

> RetryingMetaStoreClient should not retry connecting to HMS on genuine errors
> 
>
> Key: HIVE-24289
> URL: https://issues.apache.org/jira/browse/HIVE-24289
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Dmitriy Fingerman
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> When there is a genuine error from HMS, the call should not be retried in 
> RetryingMetaStoreClient. 
> For example, the following query would be retried multiple times (~20+ times) in 
> HMS, causing huge delays in processing, even though the constraint is already 
> present in HMS. 
> It should just throw the exception to the client and stop retrying in such cases.
> {noformat}
> alter table web_sales add constraint tpcds_bin_partitioned_orc_1_ws_s_hd 
> foreign key  (ws_ship_hdemo_sk) references household_demographics 
> (hd_demo_sk) disable novalidate rely;
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.thrift.TApplicationException: Internal error processing 
> add_foreign_key
>   at org.apache.hadoop.hive.ql.metadata.Hive.addForeignKey(Hive.java:5914)
> ..
> ...
> Caused by: org.apache.thrift.TApplicationException: Internal error processing 
> add_foreign_key
>at 
> org.apache.thrift.TApplicationException.read(TApplicationException.java:111)
>at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:79)
>at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_add_foreign_key(ThriftHiveMetastore.java:1872)
> {noformat}
> https://github.com/apache/hive/blob/master/standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/RetryingMetaStoreClient.java#L256
> For example, if the exception contains "Internal error processing ", it could 
> stop retrying altogether.
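A minimal sketch of the kind of guard the description suggests, assuming a 
hypothetical helper inside the retry loop (this is not the actual 
RetryingMetaStoreClient code):
{code:java}
// Hypothetical check: treat Thrift "Internal error processing <method>"
// responses as genuine server errors that should not be retried.
private static boolean isGenuineServerError(Throwable t) {
  while (t != null) {
    if (t instanceof org.apache.thrift.TApplicationException
        && t.getMessage() != null
        && t.getMessage().startsWith("Internal error processing")) {
      return true; // rethrow to the client instead of retrying
    }
    t = t.getCause();
  }
  return false;
}
{code}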



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work started] (HIVE-24289) RetryingMetaStoreClient should not retry connecting to HMS on genuine errors

2023-05-03 Thread Dmitriy Fingerman (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-24289 started by Dmitriy Fingerman.

> RetryingMetaStoreClient should not retry connecting to HMS on genuine errors
> 
>
> Key: HIVE-24289
> URL: https://issues.apache.org/jira/browse/HIVE-24289
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Dmitriy Fingerman
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> When there is a genuine error from HMS, the call should not be retried in 
> RetryingMetaStoreClient. 
> For example, the following query would be retried multiple times (~20+ times) in 
> HMS, causing huge delays in processing, even though the constraint is already 
> present in HMS. 
> It should just throw the exception to the client and stop retrying in such cases.
> {noformat}
> alter table web_sales add constraint tpcds_bin_partitioned_orc_1_ws_s_hd 
> foreign key  (ws_ship_hdemo_sk) references household_demographics 
> (hd_demo_sk) disable novalidate rely;
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.thrift.TApplicationException: Internal error processing 
> add_foreign_key
>   at org.apache.hadoop.hive.ql.metadata.Hive.addForeignKey(Hive.java:5914)
> ..
> ...
> Caused by: org.apache.thrift.TApplicationException: Internal error processing 
> add_foreign_key
>at 
> org.apache.thrift.TApplicationException.read(TApplicationException.java:111)
>at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:79)
>at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_add_foreign_key(ThriftHiveMetastore.java:1872)
> {noformat}
> https://github.com/apache/hive/blob/master/standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/RetryingMetaStoreClient.java#L256
> For example, if the exception contains "Internal error processing ", it could 
> stop retrying altogether.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HIVE-24289) RetryingMetaStoreClient should not retry connecting to HMS on genuine errors

2023-05-03 Thread Dmitriy Fingerman (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy Fingerman reassigned HIVE-24289:


Assignee: Dmitriy Fingerman  (was: Harshit Gupta)

> RetryingMetaStoreClient should not retry connecting to HMS on genuine errors
> 
>
> Key: HIVE-24289
> URL: https://issues.apache.org/jira/browse/HIVE-24289
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Dmitriy Fingerman
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> When there is a genuine error from HMS, the call should not be retried in 
> RetryingMetaStoreClient. 
> For example, the following query would be retried multiple times (~20+ times) in 
> HMS, causing huge delays in processing, even though the constraint is already 
> present in HMS. 
> It should just throw the exception to the client and stop retrying in such cases.
> {noformat}
> alter table web_sales add constraint tpcds_bin_partitioned_orc_1_ws_s_hd 
> foreign key  (ws_ship_hdemo_sk) references household_demographics 
> (hd_demo_sk) disable novalidate rely;
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.thrift.TApplicationException: Internal error processing 
> add_foreign_key
>   at org.apache.hadoop.hive.ql.metadata.Hive.addForeignKey(Hive.java:5914)
> ..
> ...
> Caused by: org.apache.thrift.TApplicationException: Internal error processing 
> add_foreign_key
>at 
> org.apache.thrift.TApplicationException.read(TApplicationException.java:111)
>at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:79)
>at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_add_foreign_key(ThriftHiveMetastore.java:1872)
> {noformat}
> https://github.com/apache/hive/blob/master/standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/RetryingMetaStoreClient.java#L256
> For example, if the exception contains "Internal error processing ", it could 
> stop retrying altogether.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HIVE-27281) Add ability of masking to Beeline q-tests

2023-05-03 Thread Dmitriy Fingerman (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy Fingerman resolved HIVE-27281.
--
Resolution: Fixed

> Add ability of masking to Beeline q-tests
> -
>
> Key: HIVE-27281
> URL: https://issues.apache.org/jira/browse/HIVE-27281
> Project: Hive
>  Issue Type: Improvement
>Reporter: Dmitriy Fingerman
>Assignee: Dmitriy Fingerman
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27311) Improve LDAP auth to support generic search bind authentication

2023-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27311?focusedWorklogId=860435&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-860435
 ]

ASF GitHub Bot logged work on HIVE-27311:
-

Author: ASF GitHub Bot
Created on: 03/May/23 20:08
Start Date: 03/May/23 20:08
Worklog Time Spent: 10m 
  Work Description: sonarcloud[bot] commented on PR #4284:
URL: https://github.com/apache/hive/pull/4284#issuecomment-1533666733

   Kudos, SonarCloud Quality Gate passed! [Quality Gate passed](https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=4284)
   
   0 Bugs  
   0 Vulnerabilities  
   0 Security Hotspots  
   7 Code Smells
   
   No Coverage information  
   No Duplication information
   
   




Issue Time Tracking
---

Worklog Id: (was: 860435)
Time Spent: 1h  (was: 50m)

> Improve LDAP auth to support generic search bind authentication
> ---
>
> Key: HIVE-27311
> URL: https://issues.apache.org/jira/browse/HIVE-27311
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 4.0.0-alpha-2
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Hive's LDAP auth configuration is home-grown and a bit specific to Hive. This 
> was by design, intending to be as flexible as possible in accommodating 
> various LDAP implementations. But it does not necessarily make it easy to 
> configure Hive with such custom values for LDAP filtering, when most other 
> components accept generic LDAP filters, for example search bind filters.
> There has to be a layer of translation to have it configured. Instead, we can 
> enhance Hive to support generic search bind filters.
> To support this, I am proposing adding NEW alternate configurations:
> hive.server2.authentication.ldap.userSearchFilter
> hive.server2.authentication.ldap.groupSearchFilter
> hive.server2.authentication.ldap.groupBaseDN
> Search bind filtering will also use the EXISTING config param
> hive.server2.authentication.ldap.baseDN
> This is an alternate configuration and will be used first if specified, so users 
> can continue to use the existing configuration as well. These changes should not 
> interfere with existing configurations.

[jira] [Work logged] (HIVE-27311) Improve LDAP auth to support generic search bind authentication

2023-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27311?focusedWorklogId=860434&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-860434
 ]

ASF GitHub Bot logged work on HIVE-27311:
-

Author: ASF GitHub Bot
Created on: 03/May/23 19:41
Start Date: 03/May/23 19:41
Worklog Time Spent: 10m 
  Work Description: nrg4878 commented on code in PR #4284:
URL: https://github.com/apache/hive/pull/4284#discussion_r1184180338


##########
service/src/java/org/apache/hive/service/auth/ldap/DirSearch.java:
##########
@@ -34,6 +34,16 @@ public interface DirSearch extends Closeable {
    */
   String findUserDn(String user) throws NamingException;
 
+  /**
+   * Finds user's distinguished name.
+   * @param user username
+   * @param userSearchFilter generic LDAP search filter, for ex: (&(uid={0})(objectClass=person))
+   * @param baseDn LDAP BaseDN for user searches, for ex: dc=apache,dc=org
+   * @return DN for the specific user if exists, null otherwise
+   * @throws NamingException
+   */
+  String findUserDnBySearch(String user, String userSearchFilter, String baseDn) throws NamingException;

Review Comment:
   Yeah, this entire code was replicated to support LDAP auth for HMS. I think 
it would make sense to make the changes to the HMS provider as well. I wasn't 
sure how to test it manually though; I'll give it a try, otherwise I may have to 
fork the work into another jira.
   
   It is possible to lump them both into a single method. I kept them separate 
for a couple of reasons. findUserDn() and findUserDnBySearch() use different 
criteria/configuration to find the user DN for a given username, and merging them 
would require a change to the interface method, which I wasn't very fond of. 
Keeping them separate, based on the factory that calls them, also means less 
intersection with existing code, as this is an alternate configuration for LDAP.
   If you feel strongly about merging them, I can give it a shot.





Issue Time Tracking
---

Worklog Id: (was: 860434)
Time Spent: 50m  (was: 40m)

> Improve LDAP auth to support generic search bind authentication
> ---
>
> Key: HIVE-27311
> URL: https://issues.apache.org/jira/browse/HIVE-27311
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 4.0.0-alpha-2
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Hive's LDAP auth configuration is home-grown and a bit specific to Hive. This 
> was by design, intending to be as flexible as possible in accommodating 
> various LDAP implementations. But it does not necessarily make it easy to 
> configure Hive with such custom values for LDAP filtering, when most other 
> components accept generic LDAP filters, for example search bind filters.
> There has to be a layer of translation to have it configured. Instead, we can 
> enhance Hive to support generic search bind filters.
> To support this, I am proposing adding NEW alternate configurations:
> hive.server2.authentication.ldap.userSearchFilter
> hive.server2.authentication.ldap.groupSearchFilter
> hive.server2.authentication.ldap.groupBaseDN
> Search bind filtering will also use the EXISTING config param
> hive.server2.authentication.ldap.baseDN
> This is an alternate configuration and will be used first if specified, so users 
> can continue to use the existing configuration as well. These changes should not 
> interfere with existing configurations.
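A sketch of what the proposed search bind configuration could look like in 
hive-site.xml. The filter and DN values below are illustrative examples, not 
defaults, and the group filter placeholder in particular is a guess:
{noformat}
<property>
  <name>hive.server2.authentication.ldap.baseDN</name>
  <value>dc=apache,dc=org</value>
</property>
<property>
  <name>hive.server2.authentication.ldap.userSearchFilter</name>
  <!-- {0} stands for the login name being authenticated -->
  <value>(&amp;(uid={0})(objectClass=person))</value>
</property>
<property>
  <name>hive.server2.authentication.ldap.groupBaseDN</name>
  <value>ou=groups,dc=apache,dc=org</value>
</property>
<property>
  <name>hive.server2.authentication.ldap.groupSearchFilter</name>
  <!-- hypothetical filter matching groups that contain the user -->
  <value>(&amp;(member={0})(objectClass=groupOfNames))</value>
</property>
{noformat}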



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27307) NPE when generating incremental rebuild plan of materialized view with empty Iceberg source table

2023-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27307?focusedWorklogId=860422&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-860422
 ]

ASF GitHub Bot logged work on HIVE-27307:
-

Author: ASF GitHub Bot
Created on: 03/May/23 19:17
Start Date: 03/May/23 19:17
Worklog Time Spent: 10m 
  Work Description: sonarcloud[bot] commented on PR #4279:
URL: https://github.com/apache/hive/pull/4279#issuecomment-1533570385

   Kudos, SonarCloud Quality Gate passed! [Quality Gate passed](https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=4279)
   
   0 Bugs  
   0 Vulnerabilities  
   0 Security Hotspots  
   0 Code Smells
   
   No Coverage information  
   No Duplication information
   
   




Issue Time Tracking
---

Worklog Id: (was: 860422)
Time Spent: 50m  (was: 40m)

> NPE when generating incremental rebuild plan of materialized view with empty 
> Iceberg source table
> -
>
> Key: HIVE-27307
> URL: https://issues.apache.org/jira/browse/HIVE-27307
> Project: Hive
>  Issue Type: Bug
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> {code}
> set hive.support.concurrency=true;
> set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
> create external table tbl_ice(a int, b string, c int) stored by iceberg 
> stored as orc tblproperties ('format-version'='1');
> create external table tbl_ice_v2(d int, e string, f int) stored by iceberg 
> stored as orc tblproperties ('format-version'='2');
> insert into tbl_ice_v2 values (1, 'one v2', 50), (4, 'four v2', 53), (5, 
> 'five v2', 54);
> create materialized view mat1 as
> select tbl_ice.b, tbl_ice.c, tbl_ice_v2.e from tbl_ice join tbl_ice_v2 on 
> tbl_ice.a=tbl_ice_v2.d where tbl_ice.c > 52;
> -- insert some new values to one of the source tables
> insert into tbl_ice values (1, 'one', 50), (2, 'two', 51), (3, 'three', 52), 
> (4, 'four', 53), (5, 'five', 54);
> alter materialized view mat1 rebuild;
> {code}
> {code}
> 2023-04-28T07:34:17,949  WARN 

[jira] [Work logged] (HIVE-24515) Analyze table job can be skipped when stats populated are already accurate

2023-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24515?focusedWorklogId=860421&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-860421
 ]

ASF GitHub Bot logged work on HIVE-24515:
-

Author: ASF GitHub Bot
Created on: 03/May/23 19:05
Start Date: 03/May/23 19:05
Worklog Time Spent: 10m 
  Work Description: difin opened a new pull request, #4288:
URL: https://github.com/apache/hive/pull/4288

   …e already accurate.
   
   
   
   ### What changes were proposed in this pull request?
   
   Skipping analyze table job when stats populated are already accurate.
   
   ### Why are the changes needed?
   
   Performance improvement, skipping stats calculation when they are already 
accurate.
   
   ### Does this PR introduce _any_ user-facing change?
   
   No
   
   ### How was this patch tested?
   
   Unit tests.




Issue Time Tracking
---

Worklog Id: (was: 860421)
Time Spent: 3h 50m  (was: 3h 40m)

> Analyze table job can be skipped when stats populated are already accurate
> --
>
> Key: HIVE-24515
> URL: https://issues.apache.org/jira/browse/HIVE-24515
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Dmitriy Fingerman
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> For non-partitioned tables, stats detail should be present at the table level,
> e.g.
> {noformat}
> COLUMN_STATS_ACCURATE={"BASIC_STATS":"true","COLUMN_STATS":{"d_current_day":"true"...
>  }}
>   {noformat}
> For partitioned tables, stats detail should be present at the partition level,
> {noformat}
> store_sales(ss_sold_date_sk=2451819)
> {totalSize=0, numRows=0, rawDataSize=0, 
> COLUMN_STATS_ACCURATE={"BASIC_STATS":"true","COLUMN_STATS":{"ss_addr_sk":"true"}}
>  
>  {noformat}
> When the populated stats are already accurate, {{analyze table tn compute 
> statistics for columns}} should skip launching the job.
>  
> For ACID tables, stats are auto-computed, so computing them again can be 
> skipped when the existing stats are accurate.
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HIVE-24515) Analyze table job can be skipped when stats populated are already accurate

2023-05-03 Thread Dmitriy Fingerman (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy Fingerman reassigned HIVE-24515:


Assignee: Dmitriy Fingerman  (was: mahesh kumar behera)

> Analyze table job can be skipped when stats populated are already accurate
> --
>
> Key: HIVE-24515
> URL: https://issues.apache.org/jira/browse/HIVE-24515
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Dmitriy Fingerman
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> For non-partitioned tables, stats detail should be present at the table level,
> e.g.
> {noformat}
> COLUMN_STATS_ACCURATE={"BASIC_STATS":"true","COLUMN_STATS":{"d_current_day":"true"...
>  }}
>   {noformat}
> For partitioned tables, stats detail should be present at the partition level,
> {noformat}
> store_sales(ss_sold_date_sk=2451819)
> {totalSize=0, numRows=0, rawDataSize=0, 
> COLUMN_STATS_ACCURATE={"BASIC_STATS":"true","COLUMN_STATS":{"ss_addr_sk":"true"}}
>  
>  {noformat}
> When the populated stats are already accurate, {{analyze table tn compute 
> statistics for columns}} should skip launching the job.
>  
> For ACID tables, stats are auto-computed, so computing them again can be 
> skipped when the existing stats are accurate.
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27284) Make HMSHandler proxy pluggable

2023-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27284?focusedWorklogId=860420&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-860420
 ]

ASF GitHub Bot logged work on HIVE-27284:
-

Author: ASF GitHub Bot
Created on: 03/May/23 18:51
Start Date: 03/May/23 18:51
Worklog Time Spent: 10m 
  Work Description: sonarcloud[bot] commented on PR #4257:
URL: https://github.com/apache/hive/pull/4257#issuecomment-1533538693

   Kudos, SonarCloud Quality Gate passed! [Quality Gate passed](https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=4257)
   
   0 Bugs  
   0 Vulnerabilities  
   0 Security Hotspots  
   14 Code Smells
   
   No Coverage information  
   No Duplication information
   
   




Issue Time Tracking
---

Worklog Id: (was: 860420)
Time Spent: 1h  (was: 50m)

> Make HMSHandler proxy pluggable
> ---
>
> Key: HIVE-27284
> URL: https://issues.apache.org/jira/browse/HIVE-27284
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 4.0.0-alpha-2
>Reporter: Wechar
>Assignee: Wechar
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Currently HMS uses the only proxy implementation of HMSHandler, i.e. 
> {{RetryingHMSHandler}}, resulting in some code hacks in {{HMSHandler}}. For 
> example, when testing HMS timeouts, we add the additional static fields 
> {{testTimeoutEnabled}} and {{testTimeoutValue}} and add sleep code in the 
> {{create_database}} method, which is neither elegant nor flexible.
> So we introduce a new conf, {{metastore.hmshandler.proxy}}, to configure the 
> proxy class for HMSHandler; it will be more convenient to extend new proxies of 
> HMSHandler, and test code can be separated from HMSHandler.
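For example, the new conf could be set in the metastore configuration; the value 
shown below is the existing default proxy named in the description, and a custom 
proxy class could be supplied instead (illustration only, not from the patch):
{noformat}
<property>
  <name>metastore.hmshandler.proxy</name>
  <!-- fully qualified class name of the HMSHandler proxy to use -->
  <value>org.apache.hadoop.hive.metastore.RetryingHMSHandler</value>
</property>
{noformat}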



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27314) Backport of HIVE-25600: Compaction job creates redundant base/delta folder within base/delta folder

2023-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-27314:
--
Labels: pull-request-available  (was: )

> Backport of HIVE-25600: Compaction job creates redundant base/delta folder 
> within base/delta folder
> ---
>
> Key: HIVE-27314
> URL: https://issues.apache.org/jira/browse/HIVE-27314
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Diksha
>Assignee: Diksha
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Backport of HIVE-25600: Compaction job creates redundant base/delta folder 
> within base/delta folder



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27314) Backport of HIVE-25600: Compaction job creates redundant base/delta folder within base/delta folder

2023-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27314?focusedWorklogId=860410&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-860410
 ]

ASF GitHub Bot logged work on HIVE-27314:
-

Author: ASF GitHub Bot
Created on: 03/May/23 17:15
Start Date: 03/May/23 17:15
Worklog Time Spent: 10m 
  Work Description: Diksha628 opened a new pull request, #4287:
URL: https://github.com/apache/hive/pull/4287

   … base/delta folder (Nikhil Gupta, reviewed by Sankar Hariappan)
   
   Signed-off-by: Sankar Hariappan 
   Closes (#2705)
   
   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   




Issue Time Tracking
---

Worklog Id: (was: 860410)
Remaining Estimate: 0h
Time Spent: 10m

> Backport of HIVE-25600: Compaction job creates redundant base/delta folder 
> within base/delta folder
> ---
>
> Key: HIVE-27314
> URL: https://issues.apache.org/jira/browse/HIVE-27314
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Diksha
>Assignee: Diksha
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Backport of HIVE-25600: Compaction job creates redundant base/delta folder 
> within base/delta folder



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27313) Backport of HIVE-24324: Remove deprecated API usage from Avro

2023-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27313?focusedWorklogId=860406&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-860406
 ]

ASF GitHub Bot logged work on HIVE-27313:
-

Author: ASF GitHub Bot
Created on: 03/May/23 17:08
Start Date: 03/May/23 17:08
Worklog Time Spent: 10m 
  Work Description: Diksha628 opened a new pull request, #4286:
URL: https://github.com/apache/hive/pull/4286

   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   




Issue Time Tracking
---

Worklog Id: (was: 860406)
Remaining Estimate: 0h
Time Spent: 10m

> Backport of HIVE-24324: Remove deprecated API usage from Avro
> -
>
> Key: HIVE-27313
> URL: https://issues.apache.org/jira/browse/HIVE-27313
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Diksha
>Assignee: Diksha
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Backport of HIVE-24324: Remove deprecated API usage from Avro



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27313) Backport of HIVE-24324: Remove deprecated API usage from Avro

2023-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-27313:
--
Labels: pull-request-available  (was: )

> Backport of HIVE-24324: Remove deprecated API usage from Avro
> -
>
> Key: HIVE-27313
> URL: https://issues.apache.org/jira/browse/HIVE-27313
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Diksha
>Assignee: Diksha
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Backport of HIVE-24324: Remove deprecated API usage from Avro



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27311) Improve LDAP auth to support generic search bind authentication

2023-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27311?focusedWorklogId=860404&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-860404
 ]

ASF GitHub Bot logged work on HIVE-27311:
-

Author: ASF GitHub Bot
Created on: 03/May/23 17:07
Start Date: 03/May/23 17:07
Worklog Time Spent: 10m 
  Work Description: saihemanth-cloudera commented on code in PR #4284:
URL: https://github.com/apache/hive/pull/4284#discussion_r1183976313


##########
service/src/java/org/apache/hive/service/auth/ldap/DirSearch.java:
##########
@@ -34,6 +34,16 @@ public interface DirSearch extends Closeable {
    */
   String findUserDn(String user) throws NamingException;
 
+  /**
+   * Finds user's distinguished name.
+   * @param user username
+   * @param userSearchFilter generic LDAP search filter, for ex: (&(uid={0})(objectClass=person))
+   * @param baseDn LDAP BaseDN for user searches, for ex: dc=apache,dc=org
+   * @return DN for the specific user if exists, null otherwise
+   * @throws NamingException
+   */
+  String findUserDnBySearch(String user, String userSearchFilter, String baseDn) throws NamingException;

Review Comment:
   There is a `DirSearch` class in the metastore package as well. Should we 
introduce a similar change there too, to keep consistency? I'm not sure if LDAP 
search in the metastore is used at all. 





Issue Time Tracking
---

Worklog Id: (was: 860404)
Time Spent: 40m  (was: 0.5h)

> Improve LDAP auth to support generic search bind authentication
> ---
>
> Key: HIVE-27311
> URL: https://issues.apache.org/jira/browse/HIVE-27311
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 4.0.0-alpha-2
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Hive's LDAP auth configuration is home-grown and a bit specific to Hive. This 
> was by design, intending to be as flexible as possible in accommodating 
> various LDAP implementations. But it does not necessarily make it easy to 
> configure Hive with such custom values for LDAP filtering, when most other 
> components accept generic LDAP filters, for example search bind filters.
> There has to be a layer of translation to have it configured. Instead, we can 
> enhance Hive to support generic search bind filters.
> To support this, I am proposing adding NEW alternate configurations:
> hive.server2.authentication.ldap.userSearchFilter
> hive.server2.authentication.ldap.groupSearchFilter
> hive.server2.authentication.ldap.groupBaseDN
> Search bind filtering will also use the EXISTING config param
> hive.server2.authentication.ldap.baseDN
> This is an alternate configuration and will be used first if specified, so users 
> can continue to use the existing configuration as well. These changes should not 
> interfere with existing configurations.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27312) Backport of HIVE-24965: Describe table partition stats fetch should be configurable

2023-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-27312:
--
Labels: pull-request-available  (was: )

> Backport of HIVE-24965: Describe table partition stats fetch should be 
> configurable
> ---
>
> Key: HIVE-27312
> URL: https://issues.apache.org/jira/browse/HIVE-27312
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Diksha
>Assignee: Diksha
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Backport of HIVE-24965: Describe table partition stats fetch should be 
> configurable



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27312) Backport of HIVE-24965: Describe table partition stats fetch should be configurable

2023-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27312?focusedWorklogId=860403&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-860403
 ]

ASF GitHub Bot logged work on HIVE-27312:
-

Author: ASF GitHub Bot
Created on: 03/May/23 17:03
Start Date: 03/May/23 17:03
Worklog Time Spent: 10m 
  Work Description: Diksha628 opened a new pull request, #4285:
URL: https://github.com/apache/hive/pull/4285

   …le(Kevin Cheung, reviewed by Sankar Hariappan)
   
   Signed-off-by: Sankar Hariappan 
   Closes (#2157)
   
   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   




Issue Time Tracking
---

Worklog Id: (was: 860403)
Remaining Estimate: 0h
Time Spent: 10m

> Backport of HIVE-24965: Describe table partition stats fetch should be 
> configurable
> ---
>
> Key: HIVE-27312
> URL: https://issues.apache.org/jira/browse/HIVE-27312
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Diksha
>Assignee: Diksha
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Backport of HIVE-24965: Describe table partition stats fetch should be 
> configurable



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27118) implement array_intersect UDF in Hive

2023-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27118?focusedWorklogId=860400&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-860400
 ]

ASF GitHub Bot logged work on HIVE-27118:
-

Author: ASF GitHub Bot
Created on: 03/May/23 17:02
Start Date: 03/May/23 17:02
Worklog Time Spent: 10m 
  Work Description: sonarcloud[bot] commented on PR #4094:
URL: https://github.com/apache/hive/pull/4094#issuecomment-1533397100

   Kudos, SonarCloud Quality Gate passed! [Quality Gate passed](https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=4094)
   
   0 Bugs  
   0 Vulnerabilities  
   0 Security Hotspots  
   0 Code Smells
   
   No Coverage information  
   No Duplication information
   
   




Issue Time Tracking
---

Worklog Id: (was: 860400)
Time Spent: 1h 10m  (was: 1h)

> implement array_intersect UDF in Hive
> -
>
> Key: HIVE-27118
> URL: https://issues.apache.org/jira/browse/HIVE-27118
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Taraka Rama Rao Lethavadla
>Assignee: Taraka Rama Rao Lethavadla
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> *array_intersect(array1, array2)*
> Returns an array of the elements in the intersection of {{array1}} and 
> {{array2}}, without duplicates.
>  
> {noformat}
> > SELECT array_intersect(array(1, 2, 2, 3), array(1, 1, 3, 5));
> [1,3]
> {noformat}
>  
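For context, a minimal sketch of how such a function can be written as a Hive 
GenericUDF. This is an illustrative skeleton only, not the actual HIVE-27118 
patch; a real implementation would also convert elements to standard Java 
objects before comparing them.

{code}
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
import org.apache.hadoop.hive.ql.exec.UDFArgumentLengthException;
import org.apache.hadoop.hive.ql.metadata.HiveException;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDF;
import org.apache.hadoop.hive.serde2.objectinspector.ListObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;

public class GenericUDFArrayIntersectSketch extends GenericUDF {
  private transient ListObjectInspector leftOI;
  private transient ListObjectInspector rightOI;

  @Override
  public ObjectInspector initialize(ObjectInspector[] arguments) throws UDFArgumentException {
    if (arguments.length != 2) {
      throw new UDFArgumentLengthException("array_intersect expects exactly 2 arguments");
    }
    leftOI = (ListObjectInspector) arguments[0];
    rightOI = (ListObjectInspector) arguments[1];
    return leftOI; // simplification: reuse the first array's inspector for the result
  }

  @Override
  public Object evaluate(DeferredObject[] arguments) throws HiveException {
    List<?> left = leftOI.getList(arguments[0].get());
    List<?> right = rightOI.getList(arguments[1].get());
    if (left == null || right == null) {
      return null;
    }
    Set<Object> rightSet = new HashSet<>(right);
    Set<Object> result = new LinkedHashSet<>(); // preserves order, drops duplicates
    for (Object element : left) {
      if (rightSet.contains(element)) {
        result.add(element);
      }
    }
    return new ArrayList<>(result);
  }

  @Override
  public String getDisplayString(String[] children) {
    return getStandardDisplayString("array_intersect", children);
  }
}
{code}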



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HIVE-27314) Backport of HIVE-25600: Compaction job creates redundant base/delta folder within base/delta folder

2023-05-03 Thread Diksha (Jira)
Diksha created HIVE-27314:
-

 Summary: Backport of HIVE-25600: Compaction job creates redundant 
base/delta folder within base/delta folder
 Key: HIVE-27314
 URL: https://issues.apache.org/jira/browse/HIVE-27314
 Project: Hive
  Issue Type: Sub-task
Reporter: Diksha
Assignee: Diksha


Backport of HIVE-25600: Compaction job creates redundant base/delta folder 
within base/delta folder



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HIVE-27313) Backport of HIVE-24324: Remove deprecated API usage from Avro

2023-05-03 Thread Diksha (Jira)
Diksha created HIVE-27313:
-

 Summary: Backport of HIVE-24324: Remove deprecated API usage from 
Avro
 Key: HIVE-27313
 URL: https://issues.apache.org/jira/browse/HIVE-27313
 Project: Hive
  Issue Type: Sub-task
Reporter: Diksha
Assignee: Diksha


Backport of HIVE-24324: Remove deprecated API usage from Avro



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HIVE-27312) Backport of HIVE-24965: Describe table partition stats fetch should be configurable

2023-05-03 Thread Diksha (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Diksha reassigned HIVE-27312:
-

Assignee: Diksha

> Backport of HIVE-24965: Describe table partition stats fetch should be 
> configurable
> ---
>
> Key: HIVE-27312
> URL: https://issues.apache.org/jira/browse/HIVE-27312
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Diksha
>Assignee: Diksha
>Priority: Major
>
> Backport of HIVE-24965: Describe table partition stats fetch should be 
> configurable



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HIVE-27312) Backport of HIVE-24965: Describe table partition stats fetch should be configurable

2023-05-03 Thread Diksha (Jira)
Diksha created HIVE-27312:
-

 Summary: Backport of HIVE-24965: Describe table partition stats 
fetch should be configurable
 Key: HIVE-27312
 URL: https://issues.apache.org/jira/browse/HIVE-27312
 Project: Hive
  Issue Type: Sub-task
Reporter: Diksha


Backport of HIVE-24965: Describe table partition stats fetch should be 
configurable



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27284) Make HMSHandler proxy pluggable

2023-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27284?focusedWorklogId=860387&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-860387
 ]

ASF GitHub Bot logged work on HIVE-27284:
-

Author: ASF GitHub Bot
Created on: 03/May/23 16:30
Start Date: 03/May/23 16:30
Worklog Time Spent: 10m 
  Work Description: wecharyu commented on PR #4257:
URL: https://github.com/apache/hive/pull/4257#issuecomment-1533349412

   @deniskuzZ @saihemanth-cloudera @veghlaci05: Could you help review this PR?




Issue Time Tracking
---

Worklog Id: (was: 860387)
Time Spent: 50m  (was: 40m)

> Make HMSHandler proxy pluggable
> ---
>
> Key: HIVE-27284
> URL: https://issues.apache.org/jira/browse/HIVE-27284
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 4.0.0-alpha-2
>Reporter: Wechar
>Assignee: Wechar
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Currently HMS uses the only proxy implementation of HMSHandler, i.e. 
> {{RetryingHMSHandler}}, resulting in some code hacks in {{HMSHandler}}. For 
> example, when testing HMS timeout, we add additional static fields 
> {{testTimeoutEnabled}} and {{testTimeoutValue}} and add sleep code in the 
> {{create_database}} method, which is neither elegant nor flexible.
> So we introduce a new conf {{metastore.hmshandler.proxy}} to configure the 
> proxy class for HMSHandler; it will be more convenient to extend HMSHandler 
> with new proxies, and it separates test code from HMSHandler.
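A hedged sketch of what a pluggable proxy could look like via 
java.lang.reflect.Proxy; the class name and wiring below are illustrative 
assumptions, not the actual patch:

{code}
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

// Hypothetical proxy that times every HMSHandler call; in this proposal such a
// class would be selected via the new metastore.hmshandler.proxy property.
public class TimingHMSHandlerProxy implements InvocationHandler {
  private final Object delegate; // the real HMSHandler instance

  public TimingHMSHandlerProxy(Object delegate) {
    this.delegate = delegate;
  }

  @Override
  public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
    long start = System.nanoTime();
    try {
      return method.invoke(delegate, args);
    } catch (InvocationTargetException e) {
      throw e.getCause(); // unwrap the handler's real exception
    } finally {
      System.out.printf("%s took %d us%n", method.getName(),
          (System.nanoTime() - start) / 1_000);
    }
  }

  @SuppressWarnings("unchecked")
  public static <T> T wrap(Class<T> iface, T handler) {
    return (T) Proxy.newProxyInstance(iface.getClassLoader(),
        new Class<?>[] {iface}, new TimingHMSHandlerProxy(handler));
  }
}
{code}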



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27311) Improve LDAP auth to support generic search bind authentication

2023-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27311?focusedWorklogId=860384&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-860384
 ]

ASF GitHub Bot logged work on HIVE-27311:
-

Author: ASF GitHub Bot
Created on: 03/May/23 16:25
Start Date: 03/May/23 16:25
Worklog Time Spent: 10m 
  Work Description: henrib commented on code in PR #4284:
URL: https://github.com/apache/hive/pull/4284#discussion_r1183920316


##
service/src/java/org/apache/hive/service/auth/ldap/DirSearch.java:
##
@@ -34,6 +34,16 @@ public interface DirSearch extends Closeable {
*/
   String findUserDn(String user) throws NamingException;
 
+  /**
+   * Finds user's distinguished name.
+   * @param user username
+   * @param userSearchFilter Generic LDAP Search filter for ex: 
(&(uid={0})(objectClass=person))
+   * @param baseDn LDAP BaseDN for user searches for ex: dc=apache,dc=org
+   * @return DN for the specific user if exists, null otherwise
+   * @throws NamingException
+   */
+  String findUserDnBySearch(String user, String userSearchFilter, String 
baseDn) throws NamingException;

Review Comment:
   Couldn't we reuse the 'findUserDn' method name (i.e. overload it) for these 
new methods?





Issue Time Tracking
---

Worklog Id: (was: 860384)
Time Spent: 0.5h  (was: 20m)

> Improve LDAP auth to support generic search bind authentication
> ---
>
> Key: HIVE-27311
> URL: https://issues.apache.org/jira/browse/HIVE-27311
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 4.0.0-alpha-2
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Hive's LDAP auth configuration is home-baked and a bit specific to hive. This 
> was by design intending to be as flexible as it can be for accommodating 
> various LDAP implementations. But this does not necessarily make it easy to 
> configure hive with such custom values for ldap filtering when most other 
> components accept generic ldap filters, for example: search bind filters.
> There has to be a layer of translation to have it configured. Instead we can 
> enhance Hive to support generic search bind filters.
> To support this, I am proposing adding NEW alternate configurations. 
> hive.server2.authentication.ldap.userSearchFilter
> hive.server2.authentication.ldap.groupSearchFilter
> hive.server2.authentication.ldap.groupBaseDN
> Search bind filtering will also use EXISTING config param
> hive.server2.authentication.ldap.baseDN
> This is alternate configuration and will be used first if specified. So users 
> can continue to use existing configuration as well. These changes should not 
> interfere with existing configurations.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26913) Iceberg: HiveVectorizedReader::parquetRecordReader should reuse footer information

2023-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26913?focusedWorklogId=860377&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-860377
 ]

ASF GitHub Bot logged work on HIVE-26913:
-

Author: ASF GitHub Bot
Created on: 03/May/23 15:50
Start Date: 03/May/23 15:50
Worklog Time Spent: 10m 
  Work Description: sonarcloud[bot] commented on PR #4136:
URL: https://github.com/apache/hive/pull/4136#issuecomment-1533283907

   Kudos, SonarCloud Quality Gate passed!
   [Quality Gate passed](https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=4136)
   Bugs: 0 (rating A)
   Vulnerabilities: 0 (rating A)
   Security Hotspots: 0 (rating A)
   Code Smells: 0 (rating A)
   No Coverage information
   No Duplication information
   
   




Issue Time Tracking
---

Worklog Id: (was: 860377)
Time Spent: 1h 20m  (was: 1h 10m)

> Iceberg: HiveVectorizedReader::parquetRecordReader should reuse footer 
> information
> --
>
> Key: HIVE-26913
> URL: https://issues.apache.org/jira/browse/HIVE-26913
> Project: Hive
>  Issue Type: Improvement
>  Components: Iceberg integration
>Reporter: Rajesh Balamohan
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: performance, pull-request-available, stability
> Fix For: 4.0.0
>
> Attachments: Screenshot 2023-01-09 at 4.01.14 PM.png
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> HiveVectorizedReader::parquetRecordReader should reuse details of parquet 
> footer, instead of reading it again.
>  
> It reads the parquet footer here:
> [https://github.com/apache/hive/blob/master/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/vector/HiveVectorizedReader.java#L230-L232]
> It reads the footer again here when constructing the vectorized record reader:
> [https://github.com/apache/hive/blob/master/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/vector/HiveVectorizedReader.java#L249]
>  
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/VectorizedParquetInputFormat.java#L50]
>  
> Check the codepath of 
> VectorizedParquetRecordReader::setupMetadataAndParquetSplit
> 
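A minimal sketch of the intended fix: read the footer once and pass the same 
ParquetMetadata to both call sites. Everything except 
ParquetFileReader.readFooter is illustrative.

{code}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.parquet.format.converter.ParquetMetadataConverter;
import org.apache.parquet.hadoop.ParquetFileReader;
import org.apache.parquet.hadoop.metadata.ParquetMetadata;

public final class FooterReuseSketch {
  // Read the footer a single time ...
  public static ParquetMetadata readFooterOnce(Configuration conf, Path file) throws IOException {
    return ParquetFileReader.readFooter(conf, file, ParquetMetadataConverter.NO_FILTER);
  }
  // ... then hand the same ParquetMetadata to both the schema/row-group logic
  // and the vectorized record reader construction, instead of calling
  // readFooter a second time.
}
{code}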

[jira] [Work logged] (HIVE-27311) Improve LDAP auth to support generic search bind authentication

2023-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27311?focusedWorklogId=860376&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-860376
 ]

ASF GitHub Bot logged work on HIVE-27311:
-

Author: ASF GitHub Bot
Created on: 03/May/23 15:42
Start Date: 03/May/23 15:42
Worklog Time Spent: 10m 
  Work Description: nrg4878 commented on PR #4284:
URL: https://github.com/apache/hive/pull/4284#issuecomment-1533272027

   @henrib Could you please review this change? Thank you in advance




Issue Time Tracking
---

Worklog Id: (was: 860376)
Time Spent: 20m  (was: 10m)

> Improve LDAP auth to support generic search bind authentication
> ---
>
> Key: HIVE-27311
> URL: https://issues.apache.org/jira/browse/HIVE-27311
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 4.0.0-alpha-2
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Hive's LDAP auth configuration is home-baked and a bit specific to Hive. This 
> was by design, intending to be as flexible as possible to accommodate various 
> LDAP implementations. But it does not make it easy to configure Hive with 
> custom values for LDAP filtering when most other components accept generic 
> LDAP filters, for example search bind filters.
> A layer of translation is needed to configure it. Instead, we can enhance 
> Hive to support generic search bind filters.
> To support this, I am proposing adding NEW alternate configurations. 
> hive.server2.authentication.ldap.userSearchFilter
> hive.server2.authentication.ldap.groupSearchFilter
> hive.server2.authentication.ldap.groupBaseDN
> Search bind filtering will also use EXISTING config param
> hive.server2.authentication.ldap.baseDN
> This alternate configuration will be used first if specified, so users can 
> continue to use the existing configuration as well. These changes should not 
> interfere with existing configurations.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27311) Improve LDAP auth to support generic search bind authentication

2023-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-27311:
--
Labels: pull-request-available  (was: )

> Improve LDAP auth to support generic search bind authentication
> ---
>
> Key: HIVE-27311
> URL: https://issues.apache.org/jira/browse/HIVE-27311
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 4.0.0-alpha-2
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Hive's LDAP auth configuration is home-baked and a bit specific to Hive. This 
> was by design, intending to be as flexible as possible to accommodate various 
> LDAP implementations. But it does not make it easy to configure Hive with 
> custom values for LDAP filtering when most other components accept generic 
> LDAP filters, for example search bind filters.
> A layer of translation is needed to configure it. Instead, we can enhance 
> Hive to support generic search bind filters.
> To support this, I am proposing adding NEW alternate configurations. 
> hive.server2.authentication.ldap.userSearchFilter
> hive.server2.authentication.ldap.groupSearchFilter
> hive.server2.authentication.ldap.groupBaseDN
> Search bind filtering will also use EXISTING config param
> hive.server2.authentication.ldap.baseDN
> This alternate configuration will be used first if specified, so users can 
> continue to use the existing configuration as well. These changes should not 
> interfere with existing configurations.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27311) Improve LDAP auth to support generic search bind authentication

2023-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27311?focusedWorklogId=860374&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-860374
 ]

ASF GitHub Bot logged work on HIVE-27311:
-

Author: ASF GitHub Bot
Created on: 03/May/23 15:40
Start Date: 03/May/23 15:40
Worklog Time Spent: 10m 
  Work Description: nrg4878 opened a new pull request, #4284:
URL: https://github.com/apache/hive/pull/4284

   … Gangam)
   
   
   ### What changes were proposed in this pull request?
   Support for generic LDAP search bind authentication with user and group 
filtering.
   For user filtering, use these configurations
   hive.server2.authentication.ldap.baseDN
   hive.server2.authentication.ldap.userSearchFilter
   
   For group filtering (in conjunction with the user filtering)
   hive.server2.authentication.ldap.groupBaseDN
   hive.server2.authentication.ldap.groupSearchFilter
   
   For example:
   user search filter: (&(uid={0})(objectClass=person))
   baseDN: ou=Users,dc=apache,dc=org
   group search filter: 
(&(|(memberUid={0})(memberUid={1}))(objectClass=posixGroup))
   groupBaseDN: ou=Groups,dc=apache,dc=org
   
   In this case, {0} in the user filter is the username to be authenticated. A 
user search is performed to find the userDN, which is then substituted into the 
group search filter to perform a search. If the result set is non-empty, the 
user is assumed to have satisfied the criteria and the auth succeeds. 
   
   Group filter configuration is optional above. In such cases, only a user 
search is performed and success is based on finding the user. 
   
   ### Why are the changes needed?
   Enabling generic ldap configuration for Hive LDAP authentication
   
   ### Does this PR introduce _any_ user-facing change?
   Backward compatible.
   
   ### How was this patch tested?
   Manually using OpenLDAP server
   Unit Tests that use Apache Directory Services LDAP server
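   For reference, a minimal sketch of setting these properties programmatically 
with HiveConf, using the example filters from this description. The LDAP URL is 
a placeholder assumption; in practice these values would live in hive-site.xml.

{code}
import org.apache.hadoop.hive.conf.HiveConf;

public final class LdapSearchBindConfSketch {
  public static void main(String[] args) {
    HiveConf conf = new HiveConf();
    conf.set("hive.server2.authentication", "LDAP");
    conf.set("hive.server2.authentication.ldap.url", "ldap://ldap.example.com"); // placeholder host
    // User filtering: baseDN plus the new generic search filter.
    conf.set("hive.server2.authentication.ldap.baseDN", "ou=Users,dc=apache,dc=org");
    conf.set("hive.server2.authentication.ldap.userSearchFilter",
        "(&(uid={0})(objectClass=person))");
    // Group filtering is optional; when set, it is applied after the user search.
    conf.set("hive.server2.authentication.ldap.groupBaseDN", "ou=Groups,dc=apache,dc=org");
    conf.set("hive.server2.authentication.ldap.groupSearchFilter",
        "(&(|(memberUid={0})(memberUid={1}))(objectClass=posixGroup))");
  }
}
{code}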




Issue Time Tracking
---

Worklog Id: (was: 860374)
Remaining Estimate: 0h
Time Spent: 10m

> Improve LDAP auth to support generic search bind authentication
> ---
>
> Key: HIVE-27311
> URL: https://issues.apache.org/jira/browse/HIVE-27311
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 4.0.0-alpha-2
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Hive's LDAP auth configuration is home-baked and a bit specific to Hive. This 
> was by design, intending to be as flexible as possible to accommodate various 
> LDAP implementations. But it does not make it easy to configure Hive with 
> custom values for LDAP filtering when most other components accept generic 
> LDAP filters, for example search bind filters.
> A layer of translation is needed to configure it. Instead, we can enhance 
> Hive to support generic search bind filters.
> To support this, I am proposing adding NEW alternate configurations. 
> hive.server2.authentication.ldap.userSearchFilter
> hive.server2.authentication.ldap.groupSearchFilter
> hive.server2.authentication.ldap.groupBaseDN
> Search bind filtering will also use EXISTING config param
> hive.server2.authentication.ldap.baseDN
> This alternate configuration will be used first if specified, so users can 
> continue to use the existing configuration as well. These changes should not 
> interfere with existing configurations.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27032) Introduce liquibase for HMS schema evolution

2023-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27032?focusedWorklogId=860368&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-860368
 ]

ASF GitHub Bot logged work on HIVE-27032:
-

Author: ASF GitHub Bot
Created on: 03/May/23 15:15
Start Date: 03/May/23 15:15
Worklog Time Spent: 10m 
  Work Description: sonarcloud[bot] commented on PR #4060:
URL: https://github.com/apache/hive/pull/4060#issuecomment-1533225549

   Kudos, SonarCloud Quality Gate passed!
   [Quality Gate passed](https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=4060)
   Bugs: 1 (rating C)
   Vulnerabilities: 0 (rating A)
   Security Hotspots: 6 (rating E)
   Code Smells: 207 (rating A)
   No Coverage information
   No Duplication information
   
   




Issue Time Tracking
---

Worklog Id: (was: 860368)
Time Spent: 4h 20m  (was: 4h 10m)

> Introduce liquibase for HMS schema evolution
> 
>
> Key: HIVE-27032
> URL: https://issues.apache.org/jira/browse/HIVE-27032
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Végh
>Assignee: László Végh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> Introduce liquibase, and replace current upgrade procedure with it.
> The Schematool CLI API should remain untouched, while under the hood, 
> liquibase should be used for HMS schema evolution.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HIVE-27305) AssertionError in Calcite during planning for incremental rebuild of materialized view with aggregate on decimal column

2023-05-03 Thread Krisztian Kasa (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Kasa resolved HIVE-27305.
---
Resolution: Fixed

Merged to master! Thanks [~lvegh] and [~amansinha100] for review!

> AssertionError in Calcite during planning for incremental rebuild of 
> materialized view with aggregate on decimal column
> ---
>
> Key: HIVE-27305
> URL: https://issues.apache.org/jira/browse/HIVE-27305
> Project: Hive
>  Issue Type: Bug
>  Components: CBO, Materialized views
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> {code}
> set hive.support.concurrency=true;
> set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
> set hive.materializedview.rewriting.sql=false;
> create table t1(a int, b decimal(7,2)) stored as orc TBLPROPERTIES 
> ('transactional'='true');
> insert into t1(a, b) values(1, 1);
> create materialized view mat1 stored as orc TBLPROPERTIES 
> ('transactional'='true') as
> select t1.a, sum(t1.b) from t1
> group by t1.a;
> insert into t1(a,b) values(2, 5);
> explain cbo alter materialized view mat1 rebuild;
> {code}
> {code}
> java.lang.AssertionError: 
> Cannot add expression of different type to set:
> set type is RecordType(INTEGER $f0, DECIMAL(17, 2) $f1) NOT NULL
> expression type is RecordType(INTEGER $f0, DECIMAL(18, 2) $f1) NOT NULL
> set is 
> rel#388:HiveAggregate.HIVE.[].any(input=HepRelVertex#387,group={0},agg#0=sum($1))
> expression is HiveProject($f0=[$3], $f1=[CASE(IS NULL($1), $4, IS NULL($4), 
> $1, +($4, $1))])
>   HiveFilter(condition=[OR($2, IS NULL($2))])
> HiveJoin(condition=[IS NOT DISTINCT FROM($0, $3)], joinType=[right], 
> algorithm=[none], cost=[not available])
>   HiveProject(a=[$0], _c1=[$1], $f2=[true])
> HiveTableScan(table=[[default, mat1]], table:alias=[default.mat1])
>   HiveAggregate(group=[{0}], agg#0=[sum($1)])
> HiveProject($f0=[$0], $f1=[$1])
>   HiveFilter(condition=[<(1, $4.writeid)])
> HiveTableScan(table=[[default, t1]], table:alias=[t1])
>   at 
> org.apache.calcite.plan.RelOptUtil.verifyTypeEquivalence(RelOptUtil.java:380)
>   at 
> org.apache.calcite.plan.hep.HepRuleCall.transformTo(HepRuleCall.java:58)
>   at 
> org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:268)
>   at 
> org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:283)
>   at 
> org.apache.hadoop.hive.ql.optimizer.calcite.rules.views.HiveAggregateIncrementalRewritingRuleBase.onMatch(HiveAggregateIncrementalRewritingRuleBase.java:161)
>   at 
> org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:333)
>   at org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:542)
>   at 
> org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:407)
>   at 
> org.apache.calcite.plan.hep.HepPlanner.executeInstruction(HepPlanner.java:243)
>   at 
> org.apache.calcite.plan.hep.HepInstruction$RuleInstance.execute(HepInstruction.java:127)
>   at 
> org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:202)
>   at 
> org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:189)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.executeProgram(CalcitePlanner.java:2468)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.executeProgram(CalcitePlanner.java:2427)
>   at 
> org.apache.hadoop.hive.ql.ddl.view.materialized.alter.rebuild.AlterMaterializedViewRebuildAnalyzer$MVRebuildCalcitePlannerAction.applyIncrementalRebuild(AlterMaterializedViewRebuildAnalyzer.java:460)
>   at 
> org.apache.hadoop.hive.ql.ddl.view.materialized.alter.rebuild.AlterMaterializedViewRebuildAnalyzer$MVRebuildCalcitePlannerAction.applyAggregateInsertIncremental(AlterMaterializedViewRebuildAnalyzer.java:352)
>   at 
> org.apache.hadoop.hive.ql.ddl.view.materialized.alter.rebuild.AlterMaterializedViewRebuildAnalyzer$MVRebuildCalcitePlannerAction.applyRecordIncrementalRebuildPlan(AlterMaterializedViewRebuildAnalyzer.java:311)
>   at 
> org.apache.hadoop.hive.ql.ddl.view.materialized.alter.rebuild.AlterMaterializedViewRebuildAnalyzer$MVRebuildCalcitePlannerAction.applyMaterializedViewRewriting(AlterMaterializedViewRebuildAnalyzer.java:278)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1722)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1591)
>   at 
> 

[jira] [Work logged] (HIVE-27305) AssertionError in Calcite during planning for incremental rebuild of materialized view with aggregate on decimal column

2023-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27305?focusedWorklogId=860365&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-860365
 ]

ASF GitHub Bot logged work on HIVE-27305:
-

Author: ASF GitHub Bot
Created on: 03/May/23 15:03
Start Date: 03/May/23 15:03
Worklog Time Spent: 10m 
  Work Description: kasakrisz merged PR #4277:
URL: https://github.com/apache/hive/pull/4277




Issue Time Tracking
---

Worklog Id: (was: 860365)
Time Spent: 1h  (was: 50m)

> AssertionError in Calcite during planning for incremental rebuild of 
> materialized view with aggregate on decimal column
> ---
>
> Key: HIVE-27305
> URL: https://issues.apache.org/jira/browse/HIVE-27305
> Project: Hive
>  Issue Type: Bug
>  Components: CBO, Materialized views
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> {code}
> set hive.support.concurrency=true;
> set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
> set hive.materializedview.rewriting.sql=false;
> create table t1(a int, b decimal(7,2)) stored as orc TBLPROPERTIES 
> ('transactional'='true');
> insert into t1(a, b) values(1, 1);
> create materialized view mat1 stored as orc TBLPROPERTIES 
> ('transactional'='true') as
> select t1.a, sum(t1.b) from t1
> group by t1.a;
> insert into t1(a,b) values(2, 5);
> explain cbo alter materialized view mat1 rebuild;
> {code}
> {code}
> java.lang.AssertionError: 
> Cannot add expression of different type to set:
> set type is RecordType(INTEGER $f0, DECIMAL(17, 2) $f1) NOT NULL
> expression type is RecordType(INTEGER $f0, DECIMAL(18, 2) $f1) NOT NULL
> set is 
> rel#388:HiveAggregate.HIVE.[].any(input=HepRelVertex#387,group={0},agg#0=sum($1))
> expression is HiveProject($f0=[$3], $f1=[CASE(IS NULL($1), $4, IS NULL($4), 
> $1, +($4, $1))])
>   HiveFilter(condition=[OR($2, IS NULL($2))])
> HiveJoin(condition=[IS NOT DISTINCT FROM($0, $3)], joinType=[right], 
> algorithm=[none], cost=[not available])
>   HiveProject(a=[$0], _c1=[$1], $f2=[true])
> HiveTableScan(table=[[default, mat1]], table:alias=[default.mat1])
>   HiveAggregate(group=[{0}], agg#0=[sum($1)])
> HiveProject($f0=[$0], $f1=[$1])
>   HiveFilter(condition=[<(1, $4.writeid)])
> HiveTableScan(table=[[default, t1]], table:alias=[t1])
>   at 
> org.apache.calcite.plan.RelOptUtil.verifyTypeEquivalence(RelOptUtil.java:380)
>   at 
> org.apache.calcite.plan.hep.HepRuleCall.transformTo(HepRuleCall.java:58)
>   at 
> org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:268)
>   at 
> org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:283)
>   at 
> org.apache.hadoop.hive.ql.optimizer.calcite.rules.views.HiveAggregateIncrementalRewritingRuleBase.onMatch(HiveAggregateIncrementalRewritingRuleBase.java:161)
>   at 
> org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:333)
>   at org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:542)
>   at 
> org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:407)
>   at 
> org.apache.calcite.plan.hep.HepPlanner.executeInstruction(HepPlanner.java:243)
>   at 
> org.apache.calcite.plan.hep.HepInstruction$RuleInstance.execute(HepInstruction.java:127)
>   at 
> org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:202)
>   at 
> org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:189)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.executeProgram(CalcitePlanner.java:2468)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.executeProgram(CalcitePlanner.java:2427)
>   at 
> org.apache.hadoop.hive.ql.ddl.view.materialized.alter.rebuild.AlterMaterializedViewRebuildAnalyzer$MVRebuildCalcitePlannerAction.applyIncrementalRebuild(AlterMaterializedViewRebuildAnalyzer.java:460)
>   at 
> org.apache.hadoop.hive.ql.ddl.view.materialized.alter.rebuild.AlterMaterializedViewRebuildAnalyzer$MVRebuildCalcitePlannerAction.applyAggregateInsertIncremental(AlterMaterializedViewRebuildAnalyzer.java:352)
>   at 
> org.apache.hadoop.hive.ql.ddl.view.materialized.alter.rebuild.AlterMaterializedViewRebuildAnalyzer$MVRebuildCalcitePlannerAction.applyRecordIncrementalRebuildPlan(AlterMaterializedViewRebuildAnalyzer.java:311)
>   at 
> 

[jira] [Created] (HIVE-27311) Improve LDAP auth to support generic search bind authentication

2023-05-03 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-27311:


 Summary: Improve LDAP auth to support generic search bind 
authentication
 Key: HIVE-27311
 URL: https://issues.apache.org/jira/browse/HIVE-27311
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Affects Versions: 4.0.0-alpha-2
Reporter: Naveen Gangam
Assignee: Naveen Gangam


Hive's LDAP auth configuration is home-baked and a bit specific to Hive. This 
was by design, intending to be as flexible as possible to accommodate various 
LDAP implementations. But it does not make it easy to configure Hive with 
custom values for LDAP filtering when most other components accept generic 
LDAP filters, for example search bind filters.

A layer of translation is needed to configure it. Instead, we can enhance Hive 
to support generic search bind filters.

To support this, I am proposing adding NEW alternate configurations. 
hive.server2.authentication.ldap.userSearchFilter
hive.server2.authentication.ldap.groupSearchFilter
hive.server2.authentication.ldap.groupBaseDN

Search bind filtering will also use EXISTING config param
hive.server2.authentication.ldap.baseDN

This alternate configuration will be used first if specified, so users can 
continue to use the existing configuration as well. These changes should not 
interfere with existing configurations.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27307) NPE when generating incremental rebuild plan of materialized view with empty Iceberg source table

2023-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27307?focusedWorklogId=860364&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-860364
 ]

ASF GitHub Bot logged work on HIVE-27307:
-

Author: ASF GitHub Bot
Created on: 03/May/23 15:02
Start Date: 03/May/23 15:02
Worklog Time Spent: 10m 
  Work Description: kasakrisz commented on code in PR #4279:
URL: https://github.com/apache/hive/pull/4279#discussion_r1183815901


##
iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergStorageHandler.java:
##
@@ -1297,8 +1297,12 @@ public boolean 
shouldOverwrite(org.apache.hadoop.hive.ql.metadata.Table mTable,
   public Boolean hasAppendsOnly(org.apache.hadoop.hive.ql.metadata.Table 
hmsTable, SnapshotContext since) {
 TableDesc tableDesc = Utilities.getTableDesc(hmsTable);
 Table table = IcebergTableUtil.getTable(conf, tableDesc.getProperties());
-boolean foundSince = false;
-for (Snapshot snapshot : table.snapshots()) {
+return hasAppendsOnly(table.snapshots(), since);
+  }
+
+  public Boolean hasAppendsOnly(Iterable<Snapshot> snapshots, SnapshotContext 
since) {
+boolean foundSince = since == null;

Review Comment:
   If `since` is null, all the snapshots have to be checked; if it is not, only 
the ones which are newer than it.





Issue Time Tracking
---

Worklog Id: (was: 860364)
Time Spent: 40m  (was: 0.5h)

> NPE when generating incremental rebuild plan of materialized view with empty 
> Iceberg source table
> -
>
> Key: HIVE-27307
> URL: https://issues.apache.org/jira/browse/HIVE-27307
> Project: Hive
>  Issue Type: Bug
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> {code}
> set hive.support.concurrency=true;
> set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
> create external table tbl_ice(a int, b string, c int) stored by iceberg 
> stored as orc tblproperties ('format-version'='1');
> create external table tbl_ice_v2(d int, e string, f int) stored by iceberg 
> stored as orc tblproperties ('format-version'='2');
> insert into tbl_ice_v2 values (1, 'one v2', 50), (4, 'four v2', 53), (5, 
> 'five v2', 54);
> create materialized view mat1 as
> select tbl_ice.b, tbl_ice.c, tbl_ice_v2.e from tbl_ice join tbl_ice_v2 on 
> tbl_ice.a=tbl_ice_v2.d where tbl_ice.c > 52;
> -- insert some new values to one of the source tables
> insert into tbl_ice values (1, 'one', 50), (2, 'two', 51), (3, 'three', 52), 
> (4, 'four', 53), (5, 'five', 54);
> alter materialized view mat1 rebuild;
> {code}
> {code}
> 2023-04-28T07:34:17,949  WARN [1fb94a8e-8d75-4a1f-8f44-a5beaa8aafb6 Listener 
> at 0.0.0.0/36857] rebuild.AlterMaterializedViewRebuildAnalyzer: Exception 
> loading materialized views
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.metadata.Hive.getValidMaterializedViews(Hive.java:2298)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.metadata.Hive.getMaterializedViewForRebuild(Hive.java:2204)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.ddl.view.materialized.alter.rebuild.AlterMaterializedViewRebuildAnalyzer$MVRebuildCalcitePlannerAction.applyMaterializedViewRewriting(AlterMaterializedViewRebuildAnalyzer.java:215)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1722)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1591)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.calcite.tools.Frameworks.lambda$withPlanner$0(Frameworks.java:131) 
> ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:914)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:180) 
> ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:126) 
> ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1343)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:570)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> 

[jira] (HIVE-24289) RetryingMetaStoreClient should not retry connecting to HMS on genuine errors

2023-05-03 Thread Dmitriy Fingerman (Jira)


[ https://issues.apache.org/jira/browse/HIVE-24289 ]


Dmitriy Fingerman deleted comment on HIVE-24289:
--

was (Author: JIRAUSER295017):
Hi [~harshit.gupta],

I would be interested to work on this ticket.

I see it is assigned to you, but wasn't updated for long time.

Can I assign it to myself and try fixing it?

> RetryingMetaStoreClient should not retry connecting to HMS on genuine errors
> 
>
> Key: HIVE-24289
> URL: https://issues.apache.org/jira/browse/HIVE-24289
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Harshit Gupta
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> When there is genuine error from HMS, it should not be retried in 
> RetryingMetaStoreClient. 
> For e.g, following query would be retried multiple times (~20+ times) in HMS 
> causing huge delay in processing, even though this constraint is available in 
> HMS. 
> It should just throw exception to client and stop retrying in such cases.
> {noformat}
> alter table web_sales add constraint tpcds_bin_partitioned_orc_1_ws_s_hd 
> foreign key  (ws_ship_hdemo_sk) references household_demographics 
> (hd_demo_sk) disable novalidate rely;
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.thrift.TApplicationException: Internal error processing 
> add_foreign_key
>   at org.apache.hadoop.hive.ql.metadata.Hive.addForeignKey(Hive.java:5914)
> ..
> ...
> Caused by: org.apache.thrift.TApplicationException: Internal error processing 
> add_foreign_key
>at 
> org.apache.thrift.TApplicationException.read(TApplicationException.java:111)
>at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:79)
>at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_add_foreign_key(ThriftHiveMetastore.java:1872)
> {noformat}
> https://github.com/apache/hive/blob/master/standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/RetryingMetaStoreClient.java#L256
> For e.g, if exception contains "Internal error processing ", it could stop 
> retrying all over again.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27307) NPE when generating incremental rebuild plan of materialized view with empty Iceberg source table

2023-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27307?focusedWorklogId=860349&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-860349
 ]

ASF GitHub Bot logged work on HIVE-27307:
-

Author: ASF GitHub Bot
Created on: 03/May/23 13:49
Start Date: 03/May/23 13:49
Worklog Time Spent: 10m 
  Work Description: veghlaci05 commented on code in PR #4279:
URL: https://github.com/apache/hive/pull/4279#discussion_r1183707833


##
iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergStorageHandler.java:
##
@@ -1297,8 +1297,12 @@ public boolean 
shouldOverwrite(org.apache.hadoop.hive.ql.metadata.Table mTable,
   public Boolean hasAppendsOnly(org.apache.hadoop.hive.ql.metadata.Table 
hmsTable, SnapshotContext since) {
 TableDesc tableDesc = Utilities.getTableDesc(hmsTable);
 Table table = IcebergTableUtil.getTable(conf, tableDesc.getProperties());
-boolean foundSince = false;
-for (Snapshot snapshot : table.snapshots()) {
+return hasAppendsOnly(table.snapshots(), since);
+  }
+
+  public Boolean hasAppendsOnly(Iterable<Snapshot> snapshots, SnapshotContext 
since) {
+boolean foundSince = since == null;

Review Comment:
   Maybe it's only me, but the name of this variable and the way it is used for 
two different things (is since non-null, and did we find the snapshot) is quite 
confusing. I would either rename it or split it into two variables.



##
iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergStorageHandler.java:
##
@@ -1297,8 +1297,12 @@ public boolean 
shouldOverwrite(org.apache.hadoop.hive.ql.metadata.Table mTable,
   public Boolean hasAppendsOnly(org.apache.hadoop.hive.ql.metadata.Table 
hmsTable, SnapshotContext since) {
 TableDesc tableDesc = Utilities.getTableDesc(hmsTable);
 Table table = IcebergTableUtil.getTable(conf, tableDesc.getProperties());
-boolean foundSince = false;
-for (Snapshot snapshot : table.snapshots()) {
+return hasAppendsOnly(table.snapshots(), since);
+  }
+
+  public Boolean hasAppendsOnly(Iterable<Snapshot> snapshots, SnapshotContext 
since) {

Review Comment:
   If this is used only by the tests, please add the @VisibleForTesting 
annotation to mark it explicitly.
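To make the first review point concrete, a hedged sketch of the suggested 
shape: splitting the flag keeps "was a starting point given" separate from 
"has it been reached". Iceberg's Snapshot and DataOperations are real APIs; 
the SnapshotContext accessor name is an assumption.

{code}
import org.apache.hadoop.hive.ql.metadata.SnapshotContext; // accessor below is assumed
import org.apache.iceberg.DataOperations;
import org.apache.iceberg.Snapshot;

public final class HasAppendsOnlySketch {
  public static Boolean hasAppendsOnly(Iterable<Snapshot> snapshots, SnapshotContext since) {
    boolean checkWholeHistory = (since == null); // no starting point given
    boolean sinceReached = checkWholeHistory;    // have we passed the starting point yet
    for (Snapshot snapshot : snapshots) {
      if (sinceReached) {
        if (!DataOperations.APPEND.equals(snapshot.operation())) {
          return false; // a non-append operation after 'since'
        }
      } else if (snapshot.snapshotId() == since.getSnapshotId()) { // accessor assumed
        sinceReached = true; // validate everything after this snapshot
      }
    }
    return sinceReached ? Boolean.TRUE : null; // null: 'since' was never found
  }
}
{code}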





Issue Time Tracking
---

Worklog Id: (was: 860349)
Time Spent: 0.5h  (was: 20m)

> NPE when generating incremental rebuild plan of materialized view with empty 
> Iceberg source table
> -
>
> Key: HIVE-27307
> URL: https://issues.apache.org/jira/browse/HIVE-27307
> Project: Hive
>  Issue Type: Bug
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> {code}
> set hive.support.concurrency=true;
> set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
> create external table tbl_ice(a int, b string, c int) stored by iceberg 
> stored as orc tblproperties ('format-version'='1');
> create external table tbl_ice_v2(d int, e string, f int) stored by iceberg 
> stored as orc tblproperties ('format-version'='2');
> insert into tbl_ice_v2 values (1, 'one v2', 50), (4, 'four v2', 53), (5, 
> 'five v2', 54);
> create materialized view mat1 as
> select tbl_ice.b, tbl_ice.c, tbl_ice_v2.e from tbl_ice join tbl_ice_v2 on 
> tbl_ice.a=tbl_ice_v2.d where tbl_ice.c > 52;
> -- insert some new values to one of the source tables
> insert into tbl_ice values (1, 'one', 50), (2, 'two', 51), (3, 'three', 52), 
> (4, 'four', 53), (5, 'five', 54);
> alter materialized view mat1 rebuild;
> {code}
> {code}
> 2023-04-28T07:34:17,949  WARN [1fb94a8e-8d75-4a1f-8f44-a5beaa8aafb6 Listener 
> at 0.0.0.0/36857] rebuild.AlterMaterializedViewRebuildAnalyzer: Exception 
> loading materialized views
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.metadata.Hive.getValidMaterializedViews(Hive.java:2298)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.metadata.Hive.getMaterializedViewForRebuild(Hive.java:2204)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.ddl.view.materialized.alter.rebuild.AlterMaterializedViewRebuildAnalyzer$MVRebuildCalcitePlannerAction.applyMaterializedViewRewriting(AlterMaterializedViewRebuildAnalyzer.java:215)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1722)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> 

[jira] [Commented] (HIVE-27308) Exposing client keystore and truststore passwords in the JDBC URL can be a security concern

2023-05-03 Thread Venugopal Reddy K (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17718937#comment-17718937
 ] 

Venugopal Reddy K commented on HIVE-27308:
--

[~leftyl] As part of this issue, I need to update the cwiki page 
[https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients]. Could 
you please help me with getting edit access to cwiki?

> Exposing client keystore and truststore passwords in the JDBC URL can be a 
> security concern
> ---
>
> Key: HIVE-27308
> URL: https://issues.apache.org/jira/browse/HIVE-27308
> Project: Hive
>  Issue Type: Improvement
>Reporter: Venugopal Reddy K
>Assignee: Venugopal Reddy K
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> At present, we may have the following keystore and truststore passwords in 
> the JDBC URL.
>  # trustStorePassword
>  # keyStorePassword
>  # zooKeeperTruststorePassword
>  # zooKeeperKeystorePassword
> Exposing these passwords in the URL can be a security concern. We can hide 
> all these passwords from the JDBC URL by protecting them in a local JCEKS 
> keystore file and passing the JCEKS file in the URL instead.
> 1. Leverage the hadoop credential provider 
> [Link|https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/CredentialProviderAPI.html#Overview]
>  Create aliases for these passwords in a local JCE keystore like below. Store 
> all the passwords in the same JCEKS file.
> {{hadoop credential create *keyStorePassword* -value 
> FDUxmzTxW15xWoaCk6GxLlaoHjnjV9H7iHqCIDxTwoq -provider 
> localjceks://file/tmp/store/client_creds.jceks}}
> 2. Add a new option *storePasswordPath* to the JDBC URL that points to the 
> local JCE keystore file storing the password aliases. When an existing 
> password option is present in the URL, we skip fetching that particular 
> alias from the local JCEKS (i.e., the existing password option takes 
> precedence). If a password option is not present in the URL, the password is 
> fetched from the local JCEKS.
> JDBC URL may look like: 
> {{beeline -u 
> "jdbc:hive2://kvr-host:10001/default;retries=5;ssl=true;sslTrustStore=/tmp/truststore.jks;transportMode=http;httpPath=cliservice;twoWay=true;sslKeyStore=/tmp/keystore.jks;{*}storePasswordPath=localjceks://file/tmp/client_creds.jceks;{*}"}}
> 3. Hive JDBC can fetch the passwords with 
> [Configuration.getPassword|https://hadoop.apache.org/docs/stable/api/org/apache/hadoop/conf/Configuration.html#getPassword-java.lang.String-]
>  API
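A small sketch of step 3, assuming the alias name created in step 1; 
Configuration.getPassword consults the providers on 
hadoop.security.credential.provider.path before falling back to the plain 
configuration value.

{code}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;

public final class JceksPasswordSketch {
  public static char[] keyStorePassword() throws IOException {
    Configuration conf = new Configuration();
    // Point the credential provider API at the local JCEKS file from the URL.
    conf.set("hadoop.security.credential.provider.path",
        "localjceks://file/tmp/client_creds.jceks");
    return conf.getPassword("keyStorePassword"); // null if the alias is absent
  }
}
{code}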



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (HIVE-27308) Exposing client keystore and truststore passwords in the JDBC URL can be a security concern

2023-05-03 Thread Venugopal Reddy K (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17718937#comment-17718937
 ] 

Venugopal Reddy K edited comment on HIVE-27308 at 5/3/23 1:38 PM:
--

[~leftyl] As part of this issue, I need to update the cwiki page 
[https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients]. Could 
you please help me with getting edit access to cwiki?


was (Author: venureddy):
[~leftyl] As part of this issue, i need to update cwiki page 
[Link|[https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients].] 
Could you please help me on how do i get edit access to cwiki.

> Exposing client keystore and truststore passwords in the JDBC URL can be a 
> security concern
> ---
>
> Key: HIVE-27308
> URL: https://issues.apache.org/jira/browse/HIVE-27308
> Project: Hive
>  Issue Type: Improvement
>Reporter: Venugopal Reddy K
>Assignee: Venugopal Reddy K
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> At present, we may have the following keystore and truststore passwords in 
> the JDBC URL.
>  # trustStorePassword
>  # keyStorePassword
>  # zooKeeperTruststorePassword
>  # zooKeeperKeystorePassword
> Exposing these passwords in the URL can be a security concern. We can hide 
> all these passwords from the JDBC URL by protecting them in a local JCEKS 
> keystore file and passing the JCEKS file in the URL instead.
> 1. Leverage the hadoop credential provider 
> [Link|https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/CredentialProviderAPI.html#Overview]
>  Create aliases for these passwords in a local JCE keystore like below. Store 
> all the passwords in the same JCEKS file.
> {{hadoop credential create *keyStorePassword* -value 
> FDUxmzTxW15xWoaCk6GxLlaoHjnjV9H7iHqCIDxTwoq -provider 
> localjceks://file/tmp/store/client_creds.jceks}}
> 2. Add a new option *storePasswordPath* to the JDBC URL that points to the 
> local JCE keystore file storing the password aliases. When an existing 
> password option is present in the URL, we skip fetching that particular 
> alias from the local JCEKS (i.e., the existing password option takes 
> precedence). If a password option is not present in the URL, the password is 
> fetched from the local JCEKS.
> JDBC URL may look like: 
> {{beeline -u 
> "jdbc:hive2://kvr-host:10001/default;retries=5;ssl=true;sslTrustStore=/tmp/truststore.jks;transportMode=http;httpPath=cliservice;twoWay=true;sslKeyStore=/tmp/keystore.jks;{*}storePasswordPath=localjceks://file/tmp/client_creds.jceks;{*}"}}
> 3. Hive JDBC can fetch the passwords with 
> [Configuration.getPassword|https://hadoop.apache.org/docs/stable/api/org/apache/hadoop/conf/Configuration.html#getPassword-java.lang.String-]
>  API



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HIVE-27308) Exposing client keystore and truststore passwords in the JDBC URL can be a security concern

2023-05-03 Thread Venugopal Reddy K (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venugopal Reddy K reassigned HIVE-27308:


Assignee: Venugopal Reddy K

> Exposing client keystore and truststore passwords in the JDBC URL can be a 
> security concern
> ---
>
> Key: HIVE-27308
> URL: https://issues.apache.org/jira/browse/HIVE-27308
> Project: Hive
>  Issue Type: Improvement
>Reporter: Venugopal Reddy K
>Assignee: Venugopal Reddy K
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> At present, we may have the following keystore and truststore passwords in 
> the JDBC URL.
>  # trustStorePassword
>  # keyStorePassword
>  # zooKeeperTruststorePassword
>  # zooKeeperKeystorePassword
> Exposing these passwords in the URL can be a security concern. We can hide 
> all these passwords from the JDBC URL by protecting them in a local JCEKS 
> keystore file and passing the JCEKS file in the URL instead.
> 1. Leverage the hadoop credential provider 
> [Link|https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/CredentialProviderAPI.html#Overview]
>  Create aliases for these passwords in a local JCE keystore like below. Store 
> all the passwords in the same JCEKS file.
> {{hadoop credential create *keyStorePassword* -value 
> FDUxmzTxW15xWoaCk6GxLlaoHjnjV9H7iHqCIDxTwoq -provider 
> localjceks://file/tmp/store/client_creds.jceks}}
> 2. Add a new option *storePasswordPath* to the JDBC URL that points to the 
> local JCE keystore file storing the password aliases. When an existing 
> password option is present in the URL, we skip fetching that particular 
> alias from the local JCEKS (i.e., the existing password option takes 
> precedence). If a password option is not present in the URL, the password is 
> fetched from the local JCEKS.
> JDBC URL may look like: 
> {{beeline -u 
> "jdbc:hive2://kvr-host:10001/default;retries=5;ssl=true;sslTrustStore=/tmp/truststore.jks;transportMode=http;httpPath=cliservice;twoWay=true;sslKeyStore=/tmp/keystore.jks;{*}storePasswordPath=localjceks://file/tmp/client_creds.jceks;{*}"}}
> 3. Hive JDBC can fetch the passwords with 
> [Configuration.getPassword|https://hadoop.apache.org/docs/stable/api/org/apache/hadoop/conf/Configuration.html#getPassword-java.lang.String-]
>  API



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-24515) Analyze table job can be skipped when stats populated are already accurate

2023-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24515?focusedWorklogId=860336&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-860336
 ]

ASF GitHub Bot logged work on HIVE-24515:
-

Author: ASF GitHub Bot
Created on: 03/May/23 13:32
Start Date: 03/May/23 13:32
Worklog Time Spent: 10m 
  Work Description: difin commented on PR #1834:
URL: https://github.com/apache/hive/pull/1834#issuecomment-1533037785

   Hi @maheshk114,
   I would like to continue working on this PR. Can you please give me 
permissions to push to your branch [mahesh/HIVE-24515]? I am getting access 
denied now.




Issue Time Tracking
---

Worklog Id: (was: 860336)
Time Spent: 3h 40m  (was: 3.5h)

> Analyze table job can be skipped when stats populated are already accurate
> --
>
> Key: HIVE-24515
> URL: https://issues.apache.org/jira/browse/HIVE-24515
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> For non-partitioned tables, stats detail should be present at table level,
> e.g.
> {noformat}
> COLUMN_STATS_ACCURATE={"BASIC_STATS":"true","COLUMN_STATS":{"d_current_day":"true"...
>  }}
>   {noformat}
> For partitioned tables, stats detail should be present at partition level,
> {noformat}
> store_sales(ss_sold_date_sk=2451819)
> {totalSize=0, numRows=0, rawDataSize=0, 
> COLUMN_STATS_ACCURATE={"BASIC_STATS":"true","COLUMN_STATS":{"ss_addr_sk":"true"}}
>  
>  {noformat}
> When stats populated are already accurate, {{analyze table tn compute 
> statistics for columns}} should skip launching the job.
>  
> For ACID tables, stats are auto computed and it can skip computing stats 
> again when stats are accurate.
>  
>  
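For illustration, the flow this improvement targets might look like the
following (using the store_sales table from the description; a hedged sketch,
not output from the patch):

{noformat}
-- Table/partition parameters show whether stats are already accurate:
DESCRIBE FORMATTED store_sales;

-- If COLUMN_STATS_ACCURATE already marks the requested columns as true,
-- this command could return immediately instead of launching a job:
ANALYZE TABLE store_sales COMPUTE STATISTICS FOR COLUMNS;
{noformat}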



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-23394) TestJdbcGenericUDTFGetSplits2#testGenericUDTFOrderBySplitCount1 is flaky

2023-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23394?focusedWorklogId=860326&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-860326
 ]

ASF GitHub Bot logged work on HIVE-23394:
-

Author: ASF GitHub Bot
Created on: 03/May/23 12:45
Start Date: 03/May/23 12:45
Worklog Time Spent: 10m 
  Work Description: simhadri-g commented on code in PR #4249:
URL: https://github.com/apache/hive/pull/4249#discussion_r1183521173


##
itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcGenericUDTFGetSplits.java:
##
@@ -38,14 +38,16 @@
 public class TestJdbcGenericUDTFGetSplits extends 
AbstractTestJdbcGenericUDTFGetSplits {
 
   @Test(timeout = 20)
-  @Ignore("HIVE-23394")
   public void testGenericUDTFOrderBySplitCount1() throws Exception {
-super.testGenericUDTFOrderBySplitCount1("get_splits", new int[]{10, 1, 0, 
2, 2, 2, 1, 10});
+super.testGenericUDTFOrderBySplitCount1("get_splits", new int[] { 10, 5, 
0, 2, 2, 2, 5 });
+super.testGenericUDTFOrderBySplitCount1("get_llap_splits", new int[] { 12, 
7, 1, 4, 4, 4, 7 });
   }
 
+
   @Test(timeout = 20)
   public void testGenericUDTFOrderBySplitCount1OnPartitionedTable() throws 
Exception {
 super.testGenericUDTFOrderBySplitCount1OnPartitionedTable("get_splits", 
new int[]{5, 5, 1, 1, 1});
+
super.testGenericUDTFOrderBySplitCount1OnPartitionedTable("get_llap_splits", 
new int[]{7, 7, 3, 3, 3});

Review Comment:
   Since both tests have almost exactly the same driver code, I thought it 
would be good to merge them into a single file to reduce code duplication. 
Also, both UDTFs exist to test JdbcGenericUDTFGetSplits, so it seemed 
reasonable to merge them.





Issue Time Tracking
---

Worklog Id: (was: 860326)
Time Spent: 3h  (was: 2h 50m)

> TestJdbcGenericUDTFGetSplits2#testGenericUDTFOrderBySplitCount1 is flaky
> 
>
> Key: HIVE-23394
> URL: https://issues.apache.org/jira/browse/HIVE-23394
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Simhadri Govindappa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> both 
> TestJdbcGenericUDTFGetSplits2.testGenericUDTFOrderBySplitCount1 and
> TestJdbcGenericUDTFGetSplits.testGenericUDTFOrderBySplitCount1
> can fail with the exception below
> seems like the connection was lost
> {code}
> Error Message
> Failed to close statement
> Stacktrace
> java.sql.SQLException: Failed to close statement
>   at 
> org.apache.hive.jdbc.HiveStatement.closeStatementIfNeeded(HiveStatement.java:200)
>   at 
> org.apache.hive.jdbc.HiveStatement.closeClientOperation(HiveStatement.java:205)
>   at org.apache.hive.jdbc.HiveStatement.close(HiveStatement.java:222)
>   at 
> org.apache.hive.jdbc.AbstractTestJdbcGenericUDTFGetSplits.runQuery(AbstractTestJdbcGenericUDTFGetSplits.java:135)
>   at 
> org.apache.hive.jdbc.AbstractTestJdbcGenericUDTFGetSplits.testGenericUDTFOrderBySplitCount1(AbstractTestJdbcGenericUDTFGetSplits.java:164)
>   at 
> org.apache.hive.jdbc.TestJdbcGenericUDTFGetSplits2.testGenericUDTFOrderBySplitCount1(TestJdbcGenericUDTFGetSplits2.java:28)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> Caused by: org.apache.thrift.TApplicationException: CloseOperation failed: 
> out of sequence response
>   at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:84)
>   at 
> org.apache.hive.service.rpc.thrift.TCLIService$Client.recv_CloseOperation(TCLIService.java:521)
>   at 
> org.apache.hive.service.rpc.thrift.TCLIService$Client.CloseOperation(TCLIService.java:508)
>   at sun.reflect.GeneratedMethodAccessor40.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> 

[jira] [Resolved] (HIVE-26659) TPC-DS query 16, 69, 94 return wrong results.

2023-05-03 Thread Krisztian Kasa (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Kasa resolved HIVE-26659.
---
Resolution: Fixed

Merged to master. Thanks [~seonggon] for the patch and [~amansinha], 
[~scarlin], [~ramesh87] for review.

> TPC-DS query 16, 69, 94 return wrong results.
> -
>
> Key: HIVE-26659
> URL: https://issues.apache.org/jira/browse/HIVE-26659
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 4.0.0-alpha-2
>Reporter: Sungwoo Park
>Assignee: Seonggon Namgung
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> TPC-DS query 16, 69, 94 return wrong results when hive.auto.convert.anti.join 
> is set to true.
>  
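For context, a minimal sketch of the setting and query shape involved (a
simplified stand-in for the NOT EXISTS subqueries in these TPC-DS queries,
not the exact benchmark text):

{noformat}
SET hive.auto.convert.anti.join=true;

-- NOT EXISTS subqueries of this shape are candidates for anti-join conversion:
SELECT count(DISTINCT ws_order_number)
FROM web_sales ws1
WHERE NOT EXISTS (SELECT 1
                  FROM web_returns wr1
                  WHERE ws1.ws_order_number = wr1.wr_order_number);
{noformat}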



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26659) TPC-DS query 16, 69, 94 return wrong results.

2023-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26659?focusedWorklogId=860318&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-860318
 ]

ASF GitHub Bot logged work on HIVE-26659:
-

Author: ASF GitHub Bot
Created on: 03/May/23 12:13
Start Date: 03/May/23 12:13
Worklog Time Spent: 10m 
  Work Description: kasakrisz merged PR #4190:
URL: https://github.com/apache/hive/pull/4190




Issue Time Tracking
---

Worklog Id: (was: 860318)
Time Spent: 2h 20m  (was: 2h 10m)

> TPC-DS query 16, 69, 94 return wrong results.
> -
>
> Key: HIVE-26659
> URL: https://issues.apache.org/jira/browse/HIVE-26659
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 4.0.0-alpha-2
>Reporter: Sungwoo Park
>Assignee: Seonggon Namgung
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> TPC-DS query 16, 69, 94 return wrong results when hive.auto.convert.anti.join 
> is set to true.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27187) Incremental rebuild of materialized view having aggregate and stored by iceberg

2023-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27187?focusedWorklogId=860300&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-860300
 ]

ASF GitHub Bot logged work on HIVE-27187:
-

Author: ASF GitHub Bot
Created on: 03/May/23 11:48
Start Date: 03/May/23 11:48
Worklog Time Spent: 10m 
  Work Description: sonarcloud[bot] commented on PR #4278:
URL: https://github.com/apache/hive/pull/4278#issuecomment-1532887376

   Kudos, SonarCloud Quality Gate passed!
   (Quality Gate: https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=4278)
   
   0 Bugs (A) | 0 Vulnerabilities (A) | 0 Security Hotspots (A) | 1 Code Smell (A)
   No Coverage information | No Duplication information
   




Issue Time Tracking
---

Worklog Id: (was: 860300)
Time Spent: 5h  (was: 4h 50m)

> Incremental rebuild of materialized view having aggregate and stored by 
> iceberg
> ---
>
> Key: HIVE-27187
> URL: https://issues.apache.org/jira/browse/HIVE-27187
> Project: Hive
>  Issue Type: Improvement
>  Components: Iceberg integration, Materialized views
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 5h
>  Remaining Estimate: 0h
>
> Currently, the incremental rebuild of a materialized view stored by Iceberg 
> whose definition query contains an aggregate operator is transformed into an 
> insert overwrite statement containing a union operator, provided the source 
> tables contain insert operations only. One branch of the union scans the 
> view; the other produces the delta.
> This can be improved further: transform the statement into a multi-insert 
> statement representing a merge, which inserts new aggregations and updates 
> existing ones.
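Roughly, the improvement aims at a rewrite of the following shape (a
hand-written sketch with hypothetical view and delta names, not the planner's
actual output):

{noformat}
-- 'sales_delta' stands for the rows inserted into the source tables
-- since the last rebuild of the materialized view 'mv_sales_agg'.
MERGE INTO mv_sales_agg mv
USING (SELECT cust_id, sum(amount) AS delta_amount
       FROM sales_delta
       GROUP BY cust_id) d
ON mv.cust_id = d.cust_id
WHEN MATCHED THEN
  UPDATE SET total_amount = mv.total_amount + d.delta_amount
WHEN NOT MATCHED THEN
  INSERT VALUES (d.cust_id, d.delta_amount);
{noformat}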



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-24515) Analyze table job can be skipped when stats populated are already accurate

2023-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24515?focusedWorklogId=860290&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-860290
 ]

ASF GitHub Bot logged work on HIVE-24515:
-

Author: ASF GitHub Bot
Created on: 03/May/23 11:08
Start Date: 03/May/23 11:08
Worklog Time Spent: 10m 
  Work Description: maheshk114 opened a new pull request, #1834:
URL: https://github.com/apache/hive/pull/1834

   …
   
   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   




Issue Time Tracking
---

Worklog Id: (was: 860290)
Time Spent: 3.5h  (was: 3h 20m)

> Analyze table job can be skipped when stats populated are already accurate
> --
>
> Key: HIVE-24515
> URL: https://issues.apache.org/jira/browse/HIVE-24515
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> For non-partitioned tables, stats detail should be present in table level,
> e.g
> {noformat}
> COLUMN_STATS_ACCURATE={"BASIC_STATS":"true","COLUMN_STATS":{"d_current_day":"true"...
>  }}
>   {noformat}
> For partitioned tables, stats detail should be present in partition level,
> {noformat}
> store_sales(ss_sold_date_sk=2451819)
> {totalSize=0, numRows=0, rawDataSize=0, 
> COLUMN_STATS_ACCURATE={"BASIC_STATS":"true","COLUMN_STATS":{"ss_addr_sk":"true"}}
>  
>  {noformat}
> When stats populated are already accurate, {{analyze table tn compute 
> statistics for columns}} should skip launching the job.
>  
> For ACID tables, stats are auto computed and it can skip computing stats 
> again when stats are accurate.
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27118) implement array_intersect UDF in Hive

2023-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27118?focusedWorklogId=860289&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-860289
 ]

ASF GitHub Bot logged work on HIVE-27118:
-

Author: ASF GitHub Bot
Created on: 03/May/23 10:54
Start Date: 03/May/23 10:54
Worklog Time Spent: 10m 
  Work Description: tarak271 commented on code in PR #4094:
URL: https://github.com/apache/hive/pull/4094#discussion_r1183532348


##
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFArrayIntersect.java:
##
@@ -0,0 +1,61 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file intersect in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql.udf.generic;
+
+import org.apache.hadoop.hive.ql.exec.Description;
+import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
+import org.apache.hadoop.hive.ql.metadata.HiveException;
+import org.apache.hadoop.hive.serde2.objectinspector.ListObjectInspector;
+import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
+
+import java.util.List;
+import java.util.stream.Collectors;
+
+/**
+ * GenericUDFArrayIntersect.
+ */
+@Description(name = "array_intersect", value = "_FUNC_(array1, array2) - 
Returns an array of the elements in the intersection of array1 and array2, 
without duplicates.", extended =
+"Example:\n" + "  > SELECT _FUNC_(array(1, 2, 3,4), array(1,2,3)) FROM src 
LIMIT 1;\n"
++ "  [1,2]") public class GenericUDFArrayIntersect extends 
AbstractGenericUDFArrayBase {
+  static final int ARRAY2_IDX = 1;

Review Comment:
   Out of the 7 UDFs, only 2 take two array inputs, so this constant is kept 
in the child classes rather than the base class.





Issue Time Tracking
---

Worklog Id: (was: 860289)
Time Spent: 1h  (was: 50m)

> implement array_intersect UDF in Hive
> -
>
> Key: HIVE-27118
> URL: https://issues.apache.org/jira/browse/HIVE-27118
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Taraka Rama Rao Lethavadla
>Assignee: Taraka Rama Rao Lethavadla
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> *array_intersect(array1, array2)*
> Returns an array of the elements in the intersection of {{array1}} and 
> {{array2}}, without duplicates.
>  
> {noformat}
> > SELECT array_intersect(array(1, 2, 2, 3), array(1, 1, 3, 5));
> [1,3]
> {noformat}
>  
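For intuition, a standalone sketch of the intersect-without-duplicates
semantics (plain Java streams; illustrative only, not the PR's
ObjectInspector-based implementation):

{code}
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class ArrayIntersectSketch {
  // Keep elements of array1 that also occur in array2, dropping duplicates.
  static <T> List<T> intersect(List<T> array1, List<T> array2) {
    return array1.stream()
        .filter(array2::contains)
        .distinct()
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    // Matches the example above: prints [1, 3]
    System.out.println(intersect(Arrays.asList(1, 2, 2, 3), Arrays.asList(1, 1, 3, 5)));
  }
}
{code}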



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27118) implement array_intersect UDF in Hive

2023-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27118?focusedWorklogId=860288&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-860288
 ]

ASF GitHub Bot logged work on HIVE-27118:
-

Author: ASF GitHub Bot
Created on: 03/May/23 10:53
Start Date: 03/May/23 10:53
Worklog Time Spent: 10m 
  Work Description: tarak271 commented on code in PR #4094:
URL: https://github.com/apache/hive/pull/4094#discussion_r1183531693


##
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFArrayIntersect.java:
##
@@ -0,0 +1,61 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file intersect in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql.udf.generic;
+
+import org.apache.hadoop.hive.ql.exec.Description;
+import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
+import org.apache.hadoop.hive.ql.metadata.HiveException;
+import org.apache.hadoop.hive.serde2.objectinspector.ListObjectInspector;
+import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
+
+import java.util.List;
+import java.util.stream.Collectors;
+
+/**
+ * GenericUDFArrayIntersect.
+ */
+@Description(name = "array_intersect", value = "_FUNC_(array1, array2) - 
Returns an array of the elements in the intersection of array1 and array2, 
without duplicates.", extended =
+"Example:\n" + "  > SELECT _FUNC_(array(1, 2, 3,4), array(1,2,3)) FROM src 
LIMIT 1;\n"
++ "  [1,2]") public class GenericUDFArrayIntersect extends 
AbstractGenericUDFArrayBase {

Review Comment:
   corrected the end result



##
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFArrayIntersect.java:
##
@@ -0,0 +1,61 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file intersect in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql.udf.generic;
+
+import org.apache.hadoop.hive.ql.exec.Description;
+import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
+import org.apache.hadoop.hive.ql.metadata.HiveException;
+import org.apache.hadoop.hive.serde2.objectinspector.ListObjectInspector;
+import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
+
+import java.util.List;
+import java.util.stream.Collectors;
+
+/**
+ * GenericUDFArrayIntersect.
+ */
+@Description(name = "array_intersect", value = "_FUNC_(array1, array2) - 
Returns an array of the elements in the intersection of array1 and array2, 
without duplicates.", extended =
+"Example:\n" + "  > SELECT _FUNC_(array(1, 2, 3,4), array(1,2,3)) FROM src 
LIMIT 1;\n"

Review Comment:
   removed limit clause





Issue Time Tracking
---

Worklog Id: (was: 860288)
Time Spent: 50m  (was: 40m)

> implement array_intersect UDF in Hive
> -
>
> Key: HIVE-27118
> URL: https://issues.apache.org/jira/browse/HIVE-27118
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Taraka Rama Rao Lethavadla
>Assignee: Taraka Rama Rao Lethavadla
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> *array_intersect(array1, array2)*
> Returns an array of the elements in the intersection of {{array1}} and 
> {{array2}}, without duplicates.
>  
> {noformat}
> > SELECT array_intersect(array(1, 2, 2, 3), array(1, 1, 3, 5));
> [1,3]
> {noformat}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

[jira] [Work logged] (HIVE-27234) Iceberg: CREATE BRANCH SQL implementation

2023-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27234?focusedWorklogId=860287&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-860287
 ]

ASF GitHub Bot logged work on HIVE-27234:
-

Author: ASF GitHub Bot
Created on: 03/May/23 10:51
Start Date: 03/May/23 10:51
Worklog Time Spent: 10m 
  Work Description: zhangbutao commented on code in PR #4216:
URL: https://github.com/apache/hive/pull/4216#discussion_r1183529581


##
iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergStorageHandler.java:
##
@@ -676,6 +678,35 @@ public void 
executeOperation(org.apache.hadoop.hive.ql.metadata.Table hmsTable,
 }
   }
 
+  @Override
+  public void createBranchOperation(org.apache.hadoop.hive.ql.metadata.Table 
hmsTable,
+  AlterTableCreateBranchSpec createBranchSpec) {
+TableDesc tableDesc = Utilities.getTableDesc(hmsTable);
+Table icebergTable = IcebergTableUtil.getTable(conf, 
tableDesc.getProperties());
+
+String branchName = createBranchSpec.getBranchName();
+Optional.ofNullable(icebergTable.currentSnapshot()).orElseThrow(() -> new 
UnsupportedOperationException(
+String.format("Cannot create branch %s on iceberg table %s.%s which 
has no snapshot",
branchName, hmsTable.getDbName(), hmsTable.getTableName())));
+Long snapshotId = Optional.ofNullable(createBranchSpec.getSnapshotId())
+.orElse(icebergTable.currentSnapshot().snapshotId());
+LOG.info("Creating branch {} on iceberg table {}.{}", branchName, 
hmsTable.getDbName(),
+hmsTable.getTableName());
+ManageSnapshots manageSnapshots = icebergTable.manageSnapshots();
+manageSnapshots.createBranch(branchName, snapshotId);

Review Comment:
   If the user omits `snapshotId` when creating a branch, the current 
snapshot id will be used.
   If the user passes a non-existent `snapshotId` when creating a branch, an 
`unknown snapshot` exception will be thrown from the Iceberg library.





Issue Time Tracking
---

Worklog Id: (was: 860287)
Time Spent: 6h 10m  (was: 6h)

> Iceberg:  CREATE BRANCH SQL implementation
> --
>
> Key: HIVE-27234
> URL: https://issues.apache.org/jira/browse/HIVE-27234
> Project: Hive
>  Issue Type: Sub-task
>  Components: Iceberg integration
>Reporter: zhangbutao
>Assignee: zhangbutao
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 6h 10m
>  Remaining Estimate: 0h
>
> Maybe we can follow spark sql about branch ddl implementation 
> [https://github.com/apache/iceberg/pull/6617]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-24515) Analyze table job can be skipped when stats populated are already accurate

2023-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24515?focusedWorklogId=860286&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-860286
 ]

ASF GitHub Bot logged work on HIVE-24515:
-

Author: ASF GitHub Bot
Created on: 03/May/23 10:50
Start Date: 03/May/23 10:50
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #1834:
URL: https://github.com/apache/hive/pull/1834#discussion_r570565029


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java:
##
@@ -4797,6 +4800,84 @@ private void heartbeatTxn(Connection dbConn, long txnid)
 }
   }
 
+  private boolean foundCommittedTransaction(Connection dbConn, long txnId, 
FindStatStatusByWriteIdRequest rqst,
+   String condition) throws 
SQLException, MetaException {
+String s = sqlGenerator.addLimitClause(1,
+"1 FROM \"COMPLETED_TXN_COMPONENTS\" WHERE \"CTC_TXNID\" " + 
condition + " " + txnId +
+" AND \"CTC_DATABASE\" = ? AND \"CTC_TABLE\" = ?");
+if (rqst.getPartName() != null) {
+  s += " AND \"CTC_PARTITION\" = ?";
+}
+
+try (PreparedStatement pStmt =
+   sqlGenerator.prepareStmtWithParameters(dbConn, s,  
Arrays.asList(rqst.getDbName(), rqst.getTblName()))) {
+  if (rqst.getPartName() != null) {
+pStmt.setString(3, rqst.getPartName());
+  }
+  LOG.debug("Going to execute query <" + s + ">");
+  try (ResultSet rs2 = pStmt.executeQuery()) {
+if (rs2.next()) {
+  return true;
+}
+  }
+}
+return false;
+  }
+
+  @Override
+  @RetrySemantics.Idempotent
+  public FindStatStatusByWriteIdResponse 
findStatStatusByWriteId(FindStatStatusByWriteIdRequest rqst)
+  throws SQLException, MetaException {
+try {
+  Connection dbConn = null;
+  Statement stmt = null;
+  try {
+lockInternal();
+dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED);
+stmt = dbConn.createStatement();
+TxnState state;
+long txnId = getTxnIdForWriteId(rqst.getDbName(), rqst.getTblName(), 
rqst.getWriteId());
+TxnStatus txnStatus = findTxnState(txnId, stmt);
+if (txnStatus == TxnStatus.ABORTED) {
+  state = TxnState.ABORTED;
+} else if (txnStatus == TxnStatus.OPEN) {
+  state = TxnState.OPEN;
+} else if (foundCommittedTransaction(dbConn, txnId, rqst, ">")) {

Review Comment:
   That's not entirely correct. A txn with a higher txnId might be committed 
before a txn with a lower id. See how the WRITE_SET table works: it has 2 
properties, WS_TXNID and WS_COMMIT_ID. That table only tracks update/delete 
operations, which are the conflicting ones; insert doesn't belong to this 
category, so you cannot rely on it either.





Issue Time Tracking
---

Worklog Id: (was: 860286)
Time Spent: 3h 20m  (was: 3h 10m)

> Analyze table job can be skipped when stats populated are already accurate
> --
>
> Key: HIVE-24515
> URL: https://issues.apache.org/jira/browse/HIVE-24515
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> For non-partitioned tables, stats detail should be present in table level,
> e.g
> {noformat}
> COLUMN_STATS_ACCURATE={"BASIC_STATS":"true","COLUMN_STATS":{"d_current_day":"true"...
>  }}
>   {noformat}
> For partitioned tables, stats detail should be present in partition level,
> {noformat}
> store_sales(ss_sold_date_sk=2451819)
> {totalSize=0, numRows=0, rawDataSize=0, 
> COLUMN_STATS_ACCURATE={"BASIC_STATS":"true","COLUMN_STATS":{"ss_addr_sk":"true"}}
>  
>  {noformat}
> When stats populated are already accurate, {{analyze table tn compute 
> statistics for columns}} should skip launching the job.
>  
> For ACID tables, stats are auto computed and it can skip computing stats 
> again when stats are accurate.
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-23394) TestJdbcGenericUDTFGetSplits2#testGenericUDTFOrderBySplitCount1 is flaky

2023-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23394?focusedWorklogId=860282&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-860282
 ]

ASF GitHub Bot logged work on HIVE-23394:
-

Author: ASF GitHub Bot
Created on: 03/May/23 10:45
Start Date: 03/May/23 10:45
Worklog Time Spent: 10m 
  Work Description: simhadri-g commented on code in PR #4249:
URL: https://github.com/apache/hive/pull/4249#discussion_r1183524657


##
itests/hive-unit/src/test/java/org/apache/hive/jdbc/AbstractTestJdbcGenericUDTFGetSplits.java:
##
@@ -179,15 +176,9 @@ protected void testGenericUDTFOrderBySplitCount1(String 
udtfName, int[] expected
 query = "select " + udtfName + "(" + "'select value from " + tableName + " 
where value is not null limit 2', 5)";
 runQuery(query, getConfigs(), expectedCounts[5]);
 
-query = "select " + udtfName + "(" +
-"'select `value` from (select value from " + tableName + " where value 
is not null order by value) as t', 5)";
+query = "select " + udtfName + "(" + "'select `value` from (select value 
from " + tableName +
+" where value is not null order by value) as t', 5)";
 runQuery(query, getConfigs(), expectedCounts[6]);
-
-List setCmds = getConfigs();
-setCmds.add("set 
hive.llap.external.splits.order.by.force.single.split=false");
-query = "select " + udtfName + "(" +
-"'select `value` from (select value from " + tableName + " where value 
is not null order by value) as t', 10)";
-runQuery(query, setCmds, expectedCounts[7]);

Review Comment:
   Done, there were 8 values before. I have updated the expected count to use 7 
values now. 
   
https://github.com/apache/hive/pull/4249/files#diff-47b7f6f11c3aabbf6811b6d4d2e48162b68e4d9f30241b956e17b0d3960ec223L43





Issue Time Tracking
---

Worklog Id: (was: 860282)
Time Spent: 2h 50m  (was: 2h 40m)

> TestJdbcGenericUDTFGetSplits2#testGenericUDTFOrderBySplitCount1 is flaky
> 
>
> Key: HIVE-23394
> URL: https://issues.apache.org/jira/browse/HIVE-23394
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Simhadri Govindappa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> both 
> TestJdbcGenericUDTFGetSplits2.testGenericUDTFOrderBySplitCount1 and
> TestJdbcGenericUDTFGetSplits.testGenericUDTFOrderBySplitCount1
> can fail with the exception below
> seems like the connection was lost
> {code}
> Error Message
> Failed to close statement
> Stacktrace
> java.sql.SQLException: Failed to close statement
>   at 
> org.apache.hive.jdbc.HiveStatement.closeStatementIfNeeded(HiveStatement.java:200)
>   at 
> org.apache.hive.jdbc.HiveStatement.closeClientOperation(HiveStatement.java:205)
>   at org.apache.hive.jdbc.HiveStatement.close(HiveStatement.java:222)
>   at 
> org.apache.hive.jdbc.AbstractTestJdbcGenericUDTFGetSplits.runQuery(AbstractTestJdbcGenericUDTFGetSplits.java:135)
>   at 
> org.apache.hive.jdbc.AbstractTestJdbcGenericUDTFGetSplits.testGenericUDTFOrderBySplitCount1(AbstractTestJdbcGenericUDTFGetSplits.java:164)
>   at 
> org.apache.hive.jdbc.TestJdbcGenericUDTFGetSplits2.testGenericUDTFOrderBySplitCount1(TestJdbcGenericUDTFGetSplits2.java:28)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> Caused by: org.apache.thrift.TApplicationException: CloseOperation failed: 
> out of sequence response
>   at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:84)
>   at 
> org.apache.hive.service.rpc.thrift.TCLIService$Client.recv_CloseOperation(TCLIService.java:521)
>   at 
> org.apache.hive.service.rpc.thrift.TCLIService$Client.CloseOperation(TCLIService.java:508)
>   at sun.reflect.GeneratedMethodAccessor40.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at 

[jira] [Work logged] (HIVE-23394) TestJdbcGenericUDTFGetSplits2#testGenericUDTFOrderBySplitCount1 is flaky

2023-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23394?focusedWorklogId=860280&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-860280
 ]

ASF GitHub Bot logged work on HIVE-23394:
-

Author: ASF GitHub Bot
Created on: 03/May/23 10:45
Start Date: 03/May/23 10:45
Worklog Time Spent: 10m 
  Work Description: simhadri-g commented on code in PR #4249:
URL: https://github.com/apache/hive/pull/4249#discussion_r1183524657


##
itests/hive-unit/src/test/java/org/apache/hive/jdbc/AbstractTestJdbcGenericUDTFGetSplits.java:
##
@@ -179,15 +176,9 @@ protected void testGenericUDTFOrderBySplitCount1(String 
udtfName, int[] expected
 query = "select " + udtfName + "(" + "'select value from " + tableName + " 
where value is not null limit 2', 5)";
 runQuery(query, getConfigs(), expectedCounts[5]);
 
-query = "select " + udtfName + "(" +
-"'select `value` from (select value from " + tableName + " where value 
is not null order by value) as t', 5)";
+query = "select " + udtfName + "(" + "'select `value` from (select value 
from " + tableName +
+" where value is not null order by value) as t', 5)";
 runQuery(query, getConfigs(), expectedCounts[6]);
-
-List setCmds = getConfigs();
-setCmds.add("set 
hive.llap.external.splits.order.by.force.single.split=false");
-query = "select " + udtfName + "(" +
-"'select `value` from (select value from " + tableName + " where value 
is not null order by value) as t', 10)";
-runQuery(query, setCmds, expectedCounts[7]);

Review Comment:
   Done, there were 8 values before. I have updated the expected count to use 7 
values now. 
   
https://github.com/apache/hive/pull/4249/files#diff-47b7f6f11c3aabbf6811b6d4d2e48162b68e4d9f30241b956e17b0d3960ec223L43





Issue Time Tracking
---

Worklog Id: (was: 860280)
Time Spent: 2h 40m  (was: 2.5h)

> TestJdbcGenericUDTFGetSplits2#testGenericUDTFOrderBySplitCount1 is flaky
> 
>
> Key: HIVE-23394
> URL: https://issues.apache.org/jira/browse/HIVE-23394
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Simhadri Govindappa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> both 
> TestJdbcGenericUDTFGetSplits2.testGenericUDTFOrderBySplitCount1 and
> TestJdbcGenericUDTFGetSplits.testGenericUDTFOrderBySplitCount1
> can fail with the exception below
> seems like the connection was lost
> {code}
> Error Message
> Failed to close statement
> Stacktrace
> java.sql.SQLException: Failed to close statement
>   at 
> org.apache.hive.jdbc.HiveStatement.closeStatementIfNeeded(HiveStatement.java:200)
>   at 
> org.apache.hive.jdbc.HiveStatement.closeClientOperation(HiveStatement.java:205)
>   at org.apache.hive.jdbc.HiveStatement.close(HiveStatement.java:222)
>   at 
> org.apache.hive.jdbc.AbstractTestJdbcGenericUDTFGetSplits.runQuery(AbstractTestJdbcGenericUDTFGetSplits.java:135)
>   at 
> org.apache.hive.jdbc.AbstractTestJdbcGenericUDTFGetSplits.testGenericUDTFOrderBySplitCount1(AbstractTestJdbcGenericUDTFGetSplits.java:164)
>   at 
> org.apache.hive.jdbc.TestJdbcGenericUDTFGetSplits2.testGenericUDTFOrderBySplitCount1(TestJdbcGenericUDTFGetSplits2.java:28)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> Caused by: org.apache.thrift.TApplicationException: CloseOperation failed: 
> out of sequence response
>   at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:84)
>   at 
> org.apache.hive.service.rpc.thrift.TCLIService$Client.recv_CloseOperation(TCLIService.java:521)
>   at 
> org.apache.hive.service.rpc.thrift.TCLIService$Client.CloseOperation(TCLIService.java:508)
>   at sun.reflect.GeneratedMethodAccessor40.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at 

[jira] [Work logged] (HIVE-23394) TestJdbcGenericUDTFGetSplits2#testGenericUDTFOrderBySplitCount1 is flaky

2023-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23394?focusedWorklogId=860278&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-860278
 ]

ASF GitHub Bot logged work on HIVE-23394:
-

Author: ASF GitHub Bot
Created on: 03/May/23 10:41
Start Date: 03/May/23 10:41
Worklog Time Spent: 10m 
  Work Description: simhadri-g commented on code in PR #4249:
URL: https://github.com/apache/hive/pull/4249#discussion_r1183521173


##
itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcGenericUDTFGetSplits.java:
##
@@ -38,14 +38,16 @@
 public class TestJdbcGenericUDTFGetSplits extends 
AbstractTestJdbcGenericUDTFGetSplits {
 
   @Test(timeout = 20)
-  @Ignore("HIVE-23394")
   public void testGenericUDTFOrderBySplitCount1() throws Exception {
-super.testGenericUDTFOrderBySplitCount1("get_splits", new int[]{10, 1, 0, 
2, 2, 2, 1, 10});
+super.testGenericUDTFOrderBySplitCount1("get_splits", new int[] { 10, 5, 
0, 2, 2, 2, 5 });
+super.testGenericUDTFOrderBySplitCount1("get_llap_splits", new int[] { 12, 
7, 1, 4, 4, 4, 7 });
   }
 
+
   @Test(timeout = 20)
   public void testGenericUDTFOrderBySplitCount1OnPartitionedTable() throws 
Exception {
 super.testGenericUDTFOrderBySplitCount1OnPartitionedTable("get_splits", 
new int[]{5, 5, 1, 1, 1});
+
super.testGenericUDTFOrderBySplitCount1OnPartitionedTable("get_llap_splits", 
new int[]{7, 7, 3, 3, 3});

Review Comment:
   Since both tests have almost exactly the same driver code, I thought it 
would be good to merge them into a single file to reduce code duplication.
   
   





Issue Time Tracking
---

Worklog Id: (was: 860278)
Time Spent: 2.5h  (was: 2h 20m)

> TestJdbcGenericUDTFGetSplits2#testGenericUDTFOrderBySplitCount1 is flaky
> 
>
> Key: HIVE-23394
> URL: https://issues.apache.org/jira/browse/HIVE-23394
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Simhadri Govindappa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> both 
> TestJdbcGenericUDTFGetSplits2.testGenericUDTFOrderBySplitCount1 and
> TestJdbcGenericUDTFGetSplits.testGenericUDTFOrderBySplitCount1
> can fail with the exception below
> seems like the connection was lost
> {code}
> Error Message
> Failed to close statement
> Stacktrace
> java.sql.SQLException: Failed to close statement
>   at 
> org.apache.hive.jdbc.HiveStatement.closeStatementIfNeeded(HiveStatement.java:200)
>   at 
> org.apache.hive.jdbc.HiveStatement.closeClientOperation(HiveStatement.java:205)
>   at org.apache.hive.jdbc.HiveStatement.close(HiveStatement.java:222)
>   at 
> org.apache.hive.jdbc.AbstractTestJdbcGenericUDTFGetSplits.runQuery(AbstractTestJdbcGenericUDTFGetSplits.java:135)
>   at 
> org.apache.hive.jdbc.AbstractTestJdbcGenericUDTFGetSplits.testGenericUDTFOrderBySplitCount1(AbstractTestJdbcGenericUDTFGetSplits.java:164)
>   at 
> org.apache.hive.jdbc.TestJdbcGenericUDTFGetSplits2.testGenericUDTFOrderBySplitCount1(TestJdbcGenericUDTFGetSplits2.java:28)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> Caused by: org.apache.thrift.TApplicationException: CloseOperation failed: 
> out of sequence response
>   at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:84)
>   at 
> org.apache.hive.service.rpc.thrift.TCLIService$Client.recv_CloseOperation(TCLIService.java:521)
>   at 
> org.apache.hive.service.rpc.thrift.TCLIService$Client.CloseOperation(TCLIService.java:508)
>   at sun.reflect.GeneratedMethodAccessor40.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hive.jdbc.HiveConnection$SynchronizedHandler.invoke(HiveConnection.java:1732)
>   at com.sun.proxy.$Proxy146.CloseOperation(Unknown 

[jira] [Work logged] (HIVE-27032) Introduce liquibase for HMS schema evolution

2023-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27032?focusedWorklogId=860274&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-860274
 ]

ASF GitHub Bot logged work on HIVE-27032:
-

Author: ASF GitHub Bot
Created on: 03/May/23 10:34
Start Date: 03/May/23 10:34
Worklog Time Spent: 10m 
  Work Description: sonarcloud[bot] commented on PR #4060:
URL: https://github.com/apache/hive/pull/4060#issuecomment-1532799108

   Kudos, SonarCloud Quality Gate passed!
   (Quality Gate: https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=4060)
   
   1 Bug (C) | 0 Vulnerabilities (A) | 6 Security Hotspots (E) | 207 Code Smells (A)
   No Coverage information | No Duplication information
   




Issue Time Tracking
---

Worklog Id: (was: 860274)
Time Spent: 4h 10m  (was: 4h)

> Introduce liquibase for HMS schema evolution
> 
>
> Key: HIVE-27032
> URL: https://issues.apache.org/jira/browse/HIVE-27032
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Végh
>Assignee: László Végh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> Introduce Liquibase, and replace the current upgrade procedure with it.
> The Schematool CLI API should remain untouched; under the hood, Liquibase 
> should be used for HMS schema evolution.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27234) Iceberg: CREATE BRANCH SQL implementation

2023-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27234?focusedWorklogId=860268&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-860268
 ]

ASF GitHub Bot logged work on HIVE-27234:
-

Author: ASF GitHub Bot
Created on: 03/May/23 10:21
Start Date: 03/May/23 10:21
Worklog Time Spent: 10m 
  Work Description: zhangbutao commented on code in PR #4216:
URL: https://github.com/apache/hive/pull/4216#discussion_r1183501618


##
ql/src/java/org/apache/hadoop/hive/ql/parse/AlterTableCreateBranchSpec.java:
##
@@ -0,0 +1,63 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.parse;
+
+import com.google.common.base.MoreObjects;
+
+public class AlterTableCreateBranchSpec {

Review Comment:
   That's what I thought before. I also want to reuse the method 
`HiveIcebergStorageHandler::executeOperation`:
   
https://github.com/apache/hive/blob/e413b445e95e7c8beb4e43f0f6b08a98934e1c2c/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergStorageHandler.java#L645
   
   But it seems that this code is for the `alter table execute` syntax. Do 
you think we can reuse it for `alter table create branch` or `alter table 
drop branch`?





Issue Time Tracking
---

Worklog Id: (was: 860268)
Time Spent: 6h  (was: 5h 50m)

> Iceberg:  CREATE BRANCH SQL implementation
> --
>
> Key: HIVE-27234
> URL: https://issues.apache.org/jira/browse/HIVE-27234
> Project: Hive
>  Issue Type: Sub-task
>  Components: Iceberg integration
>Reporter: zhangbutao
>Assignee: zhangbutao
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 6h
>  Remaining Estimate: 0h
>
> Maybe we can follow spark sql about branch ddl implementation 
> [https://github.com/apache/iceberg/pull/6617]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27234) Iceberg: CREATE BRANCH SQL implementation

2023-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27234?focusedWorklogId=860264&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-860264
 ]

ASF GitHub Bot logged work on HIVE-27234:
-

Author: ASF GitHub Bot
Created on: 03/May/23 10:11
Start Date: 03/May/23 10:11
Worklog Time Spent: 10m 
  Work Description: zhangbutao commented on code in PR #4216:
URL: https://github.com/apache/hive/pull/4216#discussion_r1183492278


##
ql/src/java/org/apache/hadoop/hive/ql/ddl/table/branch/create/AlterTableCreateBranchAnalyzer.java:
##
@@ -0,0 +1,93 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.ddl.table.branch.create;
+
+import java.util.Locale;
+import java.util.Map;
+import java.util.concurrent.TimeUnit;
+
+import org.apache.hadoop.hive.common.TableName;
+import org.apache.hadoop.hive.metastore.HiveMetaHook;
+import org.apache.hadoop.hive.ql.QueryState;
+import org.apache.hadoop.hive.ql.ddl.DDLSemanticAnalyzerFactory;
+import org.apache.hadoop.hive.ql.ddl.DDLWork;
+import org.apache.hadoop.hive.ql.ddl.table.AbstractAlterTableAnalyzer;
+import org.apache.hadoop.hive.ql.ddl.table.AlterTableType;
+import org.apache.hadoop.hive.ql.exec.TaskFactory;
+import org.apache.hadoop.hive.ql.hooks.ReadEntity;
+import org.apache.hadoop.hive.ql.metadata.Table;
+import org.apache.hadoop.hive.ql.parse.ASTNode;
+import org.apache.hadoop.hive.ql.parse.AlterTableCreateBranchSpec;
+import org.apache.hadoop.hive.ql.parse.HiveParser;
+import org.apache.hadoop.hive.ql.parse.SemanticException;
+
+@DDLSemanticAnalyzerFactory.DDLType(types = 
HiveParser.TOK_ALTERTABLE_CREATE_BRANCH)
+public class AlterTableCreateBranchAnalyzer extends AbstractAlterTableAnalyzer 
{
+
+  public AlterTableCreateBranchAnalyzer(QueryState queryState) throws 
SemanticException {
+super(queryState);
+  }
+
+  @Override
+  protected void analyzeCommand(TableName tableName, Map 
partitionSpec, ASTNode command)
+  throws SemanticException {
+Table table = getTable(tableName);
+validateAlterTableType(table, AlterTableType.CREATEBRANCH, false);
+if 
(!HiveMetaHook.ICEBERG.equalsIgnoreCase(table.getParameters().get(HiveMetaHook.TABLE_TYPE)))
 {

Review Comment:
   Sorry, can you elaborate on where this logic should be moved? This is just 
a simple Iceberg check statement.





Issue Time Tracking
---

Worklog Id: (was: 860264)
Time Spent: 5h 50m  (was: 5h 40m)

> Iceberg:  CREATE BRANCH SQL implementation
> --
>
> Key: HIVE-27234
> URL: https://issues.apache.org/jira/browse/HIVE-27234
> Project: Hive
>  Issue Type: Sub-task
>  Components: Iceberg integration
>Reporter: zhangbutao
>Assignee: zhangbutao
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> Maybe we can follow spark sql about branch ddl implementation 
> [https://github.com/apache/iceberg/pull/6617]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27112) implement array_except UDF in Hive

2023-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27112?focusedWorklogId=860262&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-860262
 ]

ASF GitHub Bot logged work on HIVE-27112:
-

Author: ASF GitHub Bot
Created on: 03/May/23 10:08
Start Date: 03/May/23 10:08
Worklog Time Spent: 10m 
  Work Description: tarak271 commented on PR #4090:
URL: https://github.com/apache/hive/pull/4090#issuecomment-1532766717

   > 
   
   Yes @saihemanth-cloudera, I need some help with the review. The coding 
part is complete.




Issue Time Tracking
---

Worklog Id: (was: 860262)
Time Spent: 1.5h  (was: 1h 20m)

> implement array_except UDF in Hive
> --
>
> Key: HIVE-27112
> URL: https://issues.apache.org/jira/browse/HIVE-27112
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Taraka Rama Rao Lethavadla
>Assignee: Taraka Rama Rao Lethavadla
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> *array_except(array1, array2)* 
> Returns an array of the elements in {{array1}} but not in {{array2}}, 
> without duplicates.
>  
> {noformat}
> > SELECT array_except(array(1, 2, 2, 3), array(1, 1, 3, 5));
> [2]
> {noformat}
>  
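Analogously to array_intersect above, a standalone sketch of the
except-without-duplicates semantics (plain Java streams; illustrative only,
not the PR's ObjectInspector-based implementation):

{code}
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class ArrayExceptSketch {
  // Keep elements of array1 that do not occur in array2, dropping duplicates.
  static <T> List<T> except(List<T> array1, List<T> array2) {
    return array1.stream()
        .filter(e -> !array2.contains(e))
        .distinct()
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    // Matches the example above: prints [2]
    System.out.println(except(Arrays.asList(1, 2, 2, 3), Arrays.asList(1, 1, 3, 5)));
  }
}
{code}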



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27234) Iceberg: CREATE BRANCH SQL implementation

2023-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27234?focusedWorklogId=860260&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-860260
 ]

ASF GitHub Bot logged work on HIVE-27234:
-

Author: ASF GitHub Bot
Created on: 03/May/23 10:07
Start Date: 03/May/23 10:07
Worklog Time Spent: 10m 
  Work Description: zhangbutao commented on code in PR #4216:
URL: https://github.com/apache/hive/pull/4216#discussion_r1183488456


##
parser/src/java/org/apache/hadoop/hive/ql/parse/AlterClauseParser.g:
##
@@ -477,6 +478,34 @@ alterStatementSuffixExecute
 -> ^(TOK_ALTERTABLE_EXECUTE KW_SET_CURRENT_SNAPSHOT $snapshotParam)
 ;
 
+alterStatementSuffixCreateBranch
+@init { gParent.pushMsg("alter table create branch", state); }
+@after { gParent.popMsg(state); }
+: KW_CREATE KW_BRANCH branchName=identifier snapshotIdOfBranch? 
branchRetain? retentionOfSnapshots?
+-> ^(TOK_ALTERTABLE_CREATE_BRANCH $branchName snapshotIdOfBranch? 
branchRetain? retentionOfSnapshots?)
+;
+
+snapshotIdOfBranch
+@init { gParent.pushMsg("alter table create branch as of version", state); }
+@after { gParent.popMsg(state); }
+: KW_AS KW_OF KW_VERSION snapshotId=Number
+-> ^(TOK_AS_OF_VERSION_BRANCH $snapshotId)

Review Comment:
   Sure, as mentioned above, we can reuse `TOK_AS_OF_VERSION` if we use `FOR 
SYSTEM_VERSION AS OF` instead of `AS OF VERSION`.





Issue Time Tracking
---

Worklog Id: (was: 860260)
Time Spent: 5.5h  (was: 5h 20m)

> Iceberg:  CREATE BRANCH SQL implementation
> --
>
> Key: HIVE-27234
> URL: https://issues.apache.org/jira/browse/HIVE-27234
> Project: Hive
>  Issue Type: Sub-task
>  Components: Iceberg integration
>Reporter: zhangbutao
>Assignee: zhangbutao
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>
> Maybe we can follow spark sql about branch ddl implementation 
> [https://github.com/apache/iceberg/pull/6617]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27234) Iceberg: CREATE BRANCH SQL implementation

2023-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27234?focusedWorklogId=860261&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-860261
 ]

ASF GitHub Bot logged work on HIVE-27234:
-

Author: ASF GitHub Bot
Created on: 03/May/23 10:07
Start Date: 03/May/23 10:07
Worklog Time Spent: 10m 
  Work Description: zhangbutao commented on code in PR #4216:
URL: https://github.com/apache/hive/pull/4216#discussion_r1183488456


##
parser/src/java/org/apache/hadoop/hive/ql/parse/AlterClauseParser.g:
##
@@ -477,6 +478,34 @@ alterStatementSuffixExecute
 -> ^(TOK_ALTERTABLE_EXECUTE KW_SET_CURRENT_SNAPSHOT $snapshotParam)
 ;
 
+alterStatementSuffixCreateBranch
+@init { gParent.pushMsg("alter table create branch", state); }
+@after { gParent.popMsg(state); }
+: KW_CREATE KW_BRANCH branchName=identifier snapshotIdOfBranch? 
branchRetain? retentionOfSnapshots?
+-> ^(TOK_ALTERTABLE_CREATE_BRANCH $branchName snapshotIdOfBranch? 
branchRetain? retentionOfSnapshots?)
+;
+
+snapshotIdOfBranch
+@init { gParent.pushMsg("alter table create branch as of version", state); }
+@after { gParent.popMsg(state); }
+: KW_AS KW_OF KW_VERSION snapshotId=Number
+-> ^(TOK_AS_OF_VERSION_BRANCH $snapshotId)

Review Comment:
   As mentioned above, we can reuse `TOK_AS_OF_VERSION` if we use `FOR 
SYSTEM_VERSION AS OF` instead of `AS OF VERSION`.





Issue Time Tracking
---

Worklog Id: (was: 860261)
Time Spent: 5h 40m  (was: 5.5h)

> Iceberg:  CREATE BRANCH SQL implementation
> --
>
> Key: HIVE-27234
> URL: https://issues.apache.org/jira/browse/HIVE-27234
> Project: Hive
>  Issue Type: Sub-task
>  Components: Iceberg integration
>Reporter: zhangbutao
>Assignee: zhangbutao
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 5h 40m
>  Remaining Estimate: 0h
>
> Maybe we can follow spark sql about branch ddl implementation 
> [https://github.com/apache/iceberg/pull/6617]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27234) Iceberg: CREATE BRANCH SQL implementation

2023-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27234?focusedWorklogId=860258=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-860258
 ]

ASF GitHub Bot logged work on HIVE-27234:
-

Author: ASF GitHub Bot
Created on: 03/May/23 10:04
Start Date: 03/May/23 10:04
Worklog Time Spent: 10m 
  Work Description: zhangbutao commented on code in PR #4216:
URL: https://github.com/apache/hive/pull/4216#discussion_r1183485755


##
parser/src/java/org/apache/hadoop/hive/ql/parse/AlterClauseParser.g:
##
@@ -477,6 +478,34 @@ alterStatementSuffixExecute
 -> ^(TOK_ALTERTABLE_EXECUTE KW_SET_CURRENT_SNAPSHOT $snapshotParam)
 ;
 
+alterStatementSuffixCreateBranch
+@init { gParent.pushMsg("alter table create branch", state); }
+@after { gParent.popMsg(state); }
+: KW_CREATE KW_BRANCH branchName=identifier snapshotIdOfBranch? 
branchRetain? retentionOfSnapshots?
+-> ^(TOK_ALTERTABLE_CREATE_BRANCH $branchName snapshotIdOfBranch? 
branchRetain? retentionOfSnapshots?)
+;
+
+snapshotIdOfBranch
+@init { gParent.pushMsg("alter table create branch as of version", state); }

Review Comment:
   My original thought was to be consistent with the spark-iceberg syntax, but it 
seems more reasonable to stay in sync with the current hive `time_travel` syntax.
   I will change the syntax and add `FOR SYSTEM_TIME AS OF`.
   
   The new syntax will be as follows:
   
   ```
   ALTER TABLE tableName
   CREATE BRANCH branchName [FOR SYSTEM_VERSION AS OF {snapshotId} | FOR SYSTEM_TIME AS OF {timestamp}]
   [RETAIN interval {DAYS | HOURS | MINUTES}]
   [WITH SNAPSHOT RETENTION {[num_snapshots SNAPSHOTS] [interval {DAYS | HOURS | MINUTES}]}]
   ```
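   
   For illustration, a hedged sketch of how the proposed syntax might be exercised 
over the HiveServer2 JDBC driver; the table name, branch names, snapshot id and 
timestamp below are made-up values, and the sketch assumes the grammar lands as 
proposed above:
   
   ```java
   import java.sql.Connection;
   import java.sql.DriverManager;
   import java.sql.Statement;
   
   public class CreateBranchExample {
     public static void main(String[] args) throws Exception {
       try (Connection conn = DriverManager.getConnection("jdbc:hive2://localhost:10000/default");
            Statement stmt = conn.createStatement()) {
         // Branch at a specific snapshot, retained for 7 days.
         stmt.execute("ALTER TABLE ice_tbl CREATE BRANCH test_branch"
             + " FOR SYSTEM_VERSION AS OF 1234567890 RETAIN 7 DAYS");
         // Branch at a point in time, keeping at least 5 snapshots for 3 days.
         stmt.execute("ALTER TABLE ice_tbl CREATE BRANCH audit_branch"
             + " FOR SYSTEM_TIME AS OF '2023-05-03 10:00:00'"
             + " WITH SNAPSHOT RETENTION 5 SNAPSHOTS 3 DAYS");
       }
     }
   }
   ```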





Issue Time Tracking
---

Worklog Id: (was: 860258)
Time Spent: 5h 20m  (was: 5h 10m)

> Iceberg:  CREATE BRANCH SQL implementation
> --
>
> Key: HIVE-27234
> URL: https://issues.apache.org/jira/browse/HIVE-27234
> Project: Hive
>  Issue Type: Sub-task
>  Components: Iceberg integration
>Reporter: zhangbutao
>Assignee: zhangbutao
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> Maybe we can follow spark sql about branch ddl implementation 
> [https://github.com/apache/iceberg/pull/6617]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-24515) Analyze table job can be skipped when stats populated are already accurate

2023-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24515?focusedWorklogId=860248=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-860248
 ]

ASF GitHub Bot logged work on HIVE-24515:
-

Author: ASF GitHub Bot
Created on: 03/May/23 09:13
Start Date: 03/May/23 09:13
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #1834:
URL: https://github.com/apache/hive/pull/1834#discussion_r570556176


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java:
##
@@ -4797,6 +4800,84 @@ private void heartbeatTxn(Connection dbConn, long txnid)
 }
   }
 
+  private boolean foundCommittedTransaction(Connection dbConn, long txnId, 
FindStatStatusByWriteIdRequest rqst,
+   String condition) throws 
SQLException, MetaException {
+String s = sqlGenerator.addLimitClause(1,
+"1 FROM \"COMPLETED_TXN_COMPONENTS\" WHERE \"CTC_TXNID\" " + 
condition + " " + txnId +
+" AND \"CTC_DATABASE\" = ? AND \"CTC_TABLE\" = ?");

Review Comment:
   You do not need the db and table filters, since a txnid is unique across 
databases and tables; see findTxnState(). The same query is duplicated in many 
other places, like getCommittedTxns(); see if you could extract it. 
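   
   For illustration, the simplification suggested here might look like the 
following hedged sketch (identifiers taken from the diff above, not the merged 
code):
   
   ```java
   // A txnid is unique across databases and tables, so the CTC_DATABASE and
   // CTC_TABLE predicates (and their bind parameters) can be dropped.
   String s = sqlGenerator.addLimitClause(1,
       "1 FROM \"COMPLETED_TXN_COMPONENTS\" WHERE \"CTC_TXNID\" " + condition + " " + txnId);
   ```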





Issue Time Tracking
---

Worklog Id: (was: 860248)
Time Spent: 3h 10m  (was: 3h)

> Analyze table job can be skipped when stats populated are already accurate
> --
>
> Key: HIVE-24515
> URL: https://issues.apache.org/jira/browse/HIVE-24515
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> For non-partitioned tables, the stats detail should be present at the table 
> level, e.g.
> {noformat}
> COLUMN_STATS_ACCURATE={"BASIC_STATS":"true","COLUMN_STATS":{"d_current_day":"true"...
>  }}
>   {noformat}
> For partitioned tables, the stats detail should be present at the partition level,
> {noformat}
> store_sales(ss_sold_date_sk=2451819)
> {totalSize=0, numRows=0, rawDataSize=0, 
> COLUMN_STATS_ACCURATE={"BASIC_STATS":"true","COLUMN_STATS":{"ss_addr_sk":"true"}}
>  
>  {noformat}
> When the populated stats are already accurate, {{analyze table tn compute 
> statistics for columns}} should skip launching the job.
>  
> For ACID tables, stats are auto-computed, so the job can skip recomputing 
> them when they are already accurate.
>  
>  
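> For illustration, a minimal sketch of the skip condition described above (the 
> parameter key is taken from the examples; this is not the actual implementation):
> {code:java}
> import java.util.Map;
> 
> public class StatsSkipSketch {
>   // Returns true when the table/partition parameters already advertise
>   // accurate basic stats, in which case the analyze job could be skipped.
>   static boolean statsAlreadyAccurate(Map<String, String> params) {
>     String v = params.get("COLUMN_STATS_ACCURATE");
>     return v != null && v.contains("\"BASIC_STATS\":\"true\"");
>   }
> }
> {code}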



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27198) Delete directly aborted transactions instead of select and loading ids

2023-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27198?focusedWorklogId=860229=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-860229
 ]

ASF GitHub Bot logged work on HIVE-27198:
-

Author: ASF GitHub Bot
Created on: 03/May/23 07:32
Start Date: 03/May/23 07:32
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #4174:
URL: https://github.com/apache/hive/pull/4174#discussion_r1183332209


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionTxnHandler.java:
##
@@ -87,6 +87,11 @@ class CompactionTxnHandler extends TxnHandler {
   "DELETE FROM \"COMPACTION_METRICS_CACHE\" WHERE \"CMC_DATABASE\" = ? AND 
\"CMC_TABLE\" = ? " +
   "AND \"CMC_METRIC_TYPE\" = ?";
 
+  private static final String DELETE_FAILED_TXNS_DIRECTLY_SQL =

Review Comment:
   move it to TxnQueries





Issue Time Tracking
---

Worklog Id: (was: 860229)
Time Spent: 3h  (was: 2h 50m)

> Delete directly aborted transactions instead of select and loading ids
> --
>
> Key: HIVE-27198
> URL: https://issues.apache.org/jira/browse/HIVE-27198
> Project: Hive
>  Issue Type: Improvement
>Reporter: Mahesh Raju Somalaraju
>Assignee: Mahesh Raju Somalaraju
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> When cleaning aborted transactions, we can delete the txns directly 
> instead of selecting them first and processing the ids.
> method name: 
> cleanEmptyAbortedAndCommittedTxns
> Code:
> String s = "SELECT \"TXN_ID\" FROM \"TXNS\" WHERE " +
> "\"TXN_ID\" NOT IN (SELECT \"TC_TXNID\" FROM \"TXN_COMPONENTS\") AND " +
> " (\"TXN_STATE\" = " + TxnStatus.ABORTED + " OR \"TXN_STATE\" = " + 
> TxnStatus.COMMITTED + ") AND "
> + " \"TXN_ID\" < " + lowWaterMark;
>  
> proposed code:
> String s = "DELETE FROM \"TXNS\" WHERE " +
> "\"TXN_ID\" NOT IN (SELECT \"TC_TXNID\" FROM \"TXN_COMPONENTS\") AND " +
> " (\"TXN_STATE\" = " + TxnStatus.ABORTED + " OR \"TXN_STATE\" = " + 
> TxnStatus.COMMITTED + ") AND "
> + " \"TXN_ID\" < " + lowWaterMark;
>  
> The SELECT should be eliminated; the DELETE can filter with the WHERE 
> clause directly instead of an IN clause built from the loaded ids.
> There is no reason to load the ids into memory and then generate a huge 
> SQL statement.
>  
> Batching is also not necessary here; we can delete the records directly.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27198) Delete directly aborted transactions instead of select and loading ids

2023-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27198?focusedWorklogId=860227=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-860227
 ]

ASF GitHub Bot logged work on HIVE-27198:
-

Author: ASF GitHub Bot
Created on: 03/May/23 07:31
Start Date: 03/May/23 07:31
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #4174:
URL: https://github.com/apache/hive/pull/4174#discussion_r1183331418


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionTxnHandler.java:
##
@@ -926,10 +900,8 @@ public void cleanEmptyAbortedAndCommittedTxns() throws 
MetaException {
 checkRetryable(e, "cleanEmptyAbortedTxns");
 throw new MetaException("Unable to connect to transaction database " +
   e.getMessage());
-  } finally {
-close(rs, stmt, dbConn);
   }
-} catch (RetryException e) {
+} catch (RetryException | SQLException e) {

Review Comment:
   SQLException from getDbConn() is not checked for being Retryable
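   
   A hedged sketch of what this might look like (method and helper names are 
assumed from the surrounding diff, not the final patch):
   
   ```java
   try (Connection dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED, connPoolCompaction)) {
     // ... issue the DELETE here ...
   } catch (SQLException e) {
     // Route connection-acquisition failures through the retry check as well,
     // instead of lumping SQLException together with RetryException.
     checkRetryable(e, "cleanEmptyAbortedAndCommittedTxns");
     throw new MetaException("Unable to connect to transaction database " + e.getMessage());
   } catch (RetryException e) {
     cleanEmptyAbortedAndCommittedTxns();
   }
   ```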





Issue Time Tracking
---

Worklog Id: (was: 860227)
Time Spent: 2h 50m  (was: 2h 40m)

> Delete directly aborted transactions instead of select and loading ids
> --
>
> Key: HIVE-27198
> URL: https://issues.apache.org/jira/browse/HIVE-27198
> Project: Hive
>  Issue Type: Improvement
>Reporter: Mahesh Raju Somalaraju
>Assignee: Mahesh Raju Somalaraju
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> When cleaning aborted transactions, we can delete the txns directly 
> instead of selecting them first and processing the ids.
> method name: 
> cleanEmptyAbortedAndCommittedTxns
> Code:
> String s = "SELECT \"TXN_ID\" FROM \"TXNS\" WHERE " +
> "\"TXN_ID\" NOT IN (SELECT \"TC_TXNID\" FROM \"TXN_COMPONENTS\") AND " +
> " (\"TXN_STATE\" = " + TxnStatus.ABORTED + " OR \"TXN_STATE\" = " + 
> TxnStatus.COMMITTED + ") AND "
> + " \"TXN_ID\" < " + lowWaterMark;
>  
> proposed code:
> String s = "DELETE FROM \"TXNS\" WHERE " +
> "\"TXN_ID\" NOT IN (SELECT \"TC_TXNID\" FROM \"TXN_COMPONENTS\") AND " +
> " (\"TXN_STATE\" = " + TxnStatus.ABORTED + " OR \"TXN_STATE\" = " + 
> TxnStatus.COMMITTED + ") AND "
> + " \"TXN_ID\" < " + lowWaterMark;
>  
> The SELECT should be eliminated; the DELETE can filter with the WHERE 
> clause directly instead of an IN clause built from the loaded ids.
> There is no reason to load the ids into memory and then generate a huge 
> SQL statement.
>  
> Batching is also not necessary here; we can delete the records directly.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27198) Delete directly aborted transactions instead of select and loading ids

2023-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27198?focusedWorklogId=860224=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-860224
 ]

ASF GitHub Bot logged work on HIVE-27198:
-

Author: ASF GitHub Bot
Created on: 03/May/23 07:23
Start Date: 03/May/23 07:23
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #4174:
URL: https://github.com/apache/hive/pull/4174#discussion_r1183324488


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionTxnHandler.java:
##
@@ -872,51 +877,20 @@ public void removeDuplicateCompletedTxnComponents() 
throws MetaException {
   @RetrySemantics.SafeToRetry
   public void cleanEmptyAbortedAndCommittedTxns() throws MetaException {
 LOG.info("Start to clean empty aborted or committed TXNS");
-try {
-  Connection dbConn = null;
-  Statement stmt = null;
-  ResultSet rs = null;
-  try {
+try (Connection dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED, 
connPoolCompaction)) {
+  try (Statement stmt = dbConn.createStatement()) {

Review Comment:
   please use prepared statement
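   
   A hedged sketch of the suggestion, binding lowWaterMark through a 
PreparedStatement instead of concatenating it into the SQL string 
(DELETE_FAILED_TXNS_DIRECTLY_SQL, getDbConn and connPoolCompaction are assumed 
from the surrounding diff, not the final patch):
   
   ```java
   try (Connection dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED, connPoolCompaction);
        PreparedStatement pstmt = dbConn.prepareStatement(DELETE_FAILED_TXNS_DIRECTLY_SQL)) {
     // The single parameter is the low-water mark below which empty
     // aborted/committed txns can be removed.
     pstmt.setLong(1, lowWaterMark);
     int deleted = pstmt.executeUpdate();
     LOG.info("Removed {} empty aborted/committed txn(s)", deleted);
   }
   ```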





Issue Time Tracking
---

Worklog Id: (was: 860224)
Time Spent: 2h 40m  (was: 2.5h)

> Delete directly aborted transactions instead of select and loading ids
> --
>
> Key: HIVE-27198
> URL: https://issues.apache.org/jira/browse/HIVE-27198
> Project: Hive
>  Issue Type: Improvement
>Reporter: Mahesh Raju Somalaraju
>Assignee: Mahesh Raju Somalaraju
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> When cleaning aborted transactions, we can delete the txns directly 
> instead of selecting them first and processing the ids.
> method name: 
> cleanEmptyAbortedAndCommittedTxns
> Code:
> String s = "SELECT \"TXN_ID\" FROM \"TXNS\" WHERE " +
> "\"TXN_ID\" NOT IN (SELECT \"TC_TXNID\" FROM \"TXN_COMPONENTS\") AND " +
> " (\"TXN_STATE\" = " + TxnStatus.ABORTED + " OR \"TXN_STATE\" = " + 
> TxnStatus.COMMITTED + ") AND "
> + " \"TXN_ID\" < " + lowWaterMark;
>  
> proposed code:
> String s = "DELETE FROM \"TXNS\" WHERE " +
> "\"TXN_ID\" NOT IN (SELECT \"TC_TXNID\" FROM \"TXN_COMPONENTS\") AND " +
> " (\"TXN_STATE\" = " + TxnStatus.ABORTED + " OR \"TXN_STATE\" = " + 
> TxnStatus.COMMITTED + ") AND "
> + " \"TXN_ID\" < " + lowWaterMark;
>  
> The SELECT should be eliminated; the DELETE can filter with the WHERE 
> clause directly instead of an IN clause built from the loaded ids.
> There is no reason to load the ids into memory and then generate a huge 
> SQL statement.
>  
> Batching is also not necessary here; we can delete the records directly.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-23394) TestJdbcGenericUDTFGetSplits2#testGenericUDTFOrderBySplitCount1 is flaky

2023-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23394?focusedWorklogId=860223=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-860223
 ]

ASF GitHub Bot logged work on HIVE-23394:
-

Author: ASF GitHub Bot
Created on: 03/May/23 07:15
Start Date: 03/May/23 07:15
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #4249:
URL: https://github.com/apache/hive/pull/4249#discussion_r1183317557


##
itests/hive-unit/src/test/java/org/apache/hive/jdbc/AbstractTestJdbcGenericUDTFGetSplits.java:
##
@@ -179,15 +176,9 @@ protected void testGenericUDTFOrderBySplitCount1(String 
udtfName, int[] expected
 query = "select " + udtfName + "(" + "'select value from " + tableName + " 
where value is not null limit 2', 5)";
 runQuery(query, getConfigs(), expectedCounts[5]);
 
-query = "select " + udtfName + "(" +
-"'select `value` from (select value from " + tableName + " where value 
is not null order by value) as t', 5)";
+query = "select " + udtfName + "(" + "'select `value` from (select value 
from " + tableName +
+" where value is not null order by value) as t', 5)";
 runQuery(query, getConfigs(), expectedCounts[6]);
-
-List setCmds = getConfigs();
-setCmds.add("set 
hive.llap.external.splits.order.by.force.single.split=false");
-query = "select " + udtfName + "(" +
-"'select `value` from (select value from " + tableName + " where value 
is not null order by value) as t', 10)";
-runQuery(query, setCmds, expectedCounts[7]);

Review Comment:
   testGenericUDTFOrderBySplitCount1 now uses only 6 values from 
expectedCounts; please update the test inputs accordingly.





Issue Time Tracking
---

Worklog Id: (was: 860223)
Time Spent: 2h 20m  (was: 2h 10m)

> TestJdbcGenericUDTFGetSplits2#testGenericUDTFOrderBySplitCount1 is flaky
> 
>
> Key: HIVE-23394
> URL: https://issues.apache.org/jira/browse/HIVE-23394
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Simhadri Govindappa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> both 
> TestJdbcGenericUDTFGetSplits2.testGenericUDTFOrderBySplitCount1 and
> TestJdbcGenericUDTFGetSplits.testGenericUDTFOrderBySplitCount1
> can fail with the exception below
> seems like the connection was lost
> {code}
> Error Message
> Failed to close statement
> Stacktrace
> java.sql.SQLException: Failed to close statement
>   at 
> org.apache.hive.jdbc.HiveStatement.closeStatementIfNeeded(HiveStatement.java:200)
>   at 
> org.apache.hive.jdbc.HiveStatement.closeClientOperation(HiveStatement.java:205)
>   at org.apache.hive.jdbc.HiveStatement.close(HiveStatement.java:222)
>   at 
> org.apache.hive.jdbc.AbstractTestJdbcGenericUDTFGetSplits.runQuery(AbstractTestJdbcGenericUDTFGetSplits.java:135)
>   at 
> org.apache.hive.jdbc.AbstractTestJdbcGenericUDTFGetSplits.testGenericUDTFOrderBySplitCount1(AbstractTestJdbcGenericUDTFGetSplits.java:164)
>   at 
> org.apache.hive.jdbc.TestJdbcGenericUDTFGetSplits2.testGenericUDTFOrderBySplitCount1(TestJdbcGenericUDTFGetSplits2.java:28)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> Caused by: org.apache.thrift.TApplicationException: CloseOperation failed: 
> out of sequence response
>   at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:84)
>   at 
> org.apache.hive.service.rpc.thrift.TCLIService$Client.recv_CloseOperation(TCLIService.java:521)
>   at 
> org.apache.hive.service.rpc.thrift.TCLIService$Client.CloseOperation(TCLIService.java:508)
>   at sun.reflect.GeneratedMethodAccessor40.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> 

[jira] [Work logged] (HIVE-23394) TestJdbcGenericUDTFGetSplits2#testGenericUDTFOrderBySplitCount1 is flaky

2023-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23394?focusedWorklogId=860219=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-860219
 ]

ASF GitHub Bot logged work on HIVE-23394:
-

Author: ASF GitHub Bot
Created on: 03/May/23 07:08
Start Date: 03/May/23 07:08
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #4249:
URL: https://github.com/apache/hive/pull/4249#discussion_r1183305035


##
itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcGenericUDTFGetSplits.java:
##
@@ -38,9 +38,13 @@
 public class TestJdbcGenericUDTFGetSplits extends 
AbstractTestJdbcGenericUDTFGetSplits {
 
   @Test(timeout = 20)
-  @Ignore("HIVE-23394")
   public void testGenericUDTFOrderBySplitCount1() throws Exception {
-super.testGenericUDTFOrderBySplitCount1("get_splits", new int[]{10, 1, 0, 
2, 2, 2, 1, 10});
+super.testGenericUDTFOrderBySplitCount1("get_splits", new int[] { 10, 5, 0 
});

Review Comment:
   +1





Issue Time Tracking
---

Worklog Id: (was: 860219)
Time Spent: 2h 10m  (was: 2h)

> TestJdbcGenericUDTFGetSplits2#testGenericUDTFOrderBySplitCount1 is flaky
> 
>
> Key: HIVE-23394
> URL: https://issues.apache.org/jira/browse/HIVE-23394
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Simhadri Govindappa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> both 
> TestJdbcGenericUDTFGetSplits2.testGenericUDTFOrderBySplitCount1 and
> TestJdbcGenericUDTFGetSplits.testGenericUDTFOrderBySplitCount1
> can fail with the exception below
> seems like the connection was lost
> {code}
> Error Message
> Failed to close statement
> Stacktrace
> java.sql.SQLException: Failed to close statement
>   at 
> org.apache.hive.jdbc.HiveStatement.closeStatementIfNeeded(HiveStatement.java:200)
>   at 
> org.apache.hive.jdbc.HiveStatement.closeClientOperation(HiveStatement.java:205)
>   at org.apache.hive.jdbc.HiveStatement.close(HiveStatement.java:222)
>   at 
> org.apache.hive.jdbc.AbstractTestJdbcGenericUDTFGetSplits.runQuery(AbstractTestJdbcGenericUDTFGetSplits.java:135)
>   at 
> org.apache.hive.jdbc.AbstractTestJdbcGenericUDTFGetSplits.testGenericUDTFOrderBySplitCount1(AbstractTestJdbcGenericUDTFGetSplits.java:164)
>   at 
> org.apache.hive.jdbc.TestJdbcGenericUDTFGetSplits2.testGenericUDTFOrderBySplitCount1(TestJdbcGenericUDTFGetSplits2.java:28)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> Caused by: org.apache.thrift.TApplicationException: CloseOperation failed: 
> out of sequence response
>   at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:84)
>   at 
> org.apache.hive.service.rpc.thrift.TCLIService$Client.recv_CloseOperation(TCLIService.java:521)
>   at 
> org.apache.hive.service.rpc.thrift.TCLIService$Client.CloseOperation(TCLIService.java:508)
>   at sun.reflect.GeneratedMethodAccessor40.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hive.jdbc.HiveConnection$SynchronizedHandler.invoke(HiveConnection.java:1732)
>   at com.sun.proxy.$Proxy146.CloseOperation(Unknown Source)
>   at 
> org.apache.hive.jdbc.HiveStatement.closeStatementIfNeeded(HiveStatement.java:193)
>   ... 14 more
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-23394) TestJdbcGenericUDTFGetSplits2#testGenericUDTFOrderBySplitCount1 is flaky

2023-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23394?focusedWorklogId=860218=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-860218
 ]

ASF GitHub Bot logged work on HIVE-23394:
-

Author: ASF GitHub Bot
Created on: 03/May/23 07:00
Start Date: 03/May/23 07:00
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #4249:
URL: https://github.com/apache/hive/pull/4249#discussion_r1183305035


##
itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcGenericUDTFGetSplits.java:
##
@@ -38,9 +38,13 @@
 public class TestJdbcGenericUDTFGetSplits extends 
AbstractTestJdbcGenericUDTFGetSplits {
 
   @Test(timeout = 20)
-  @Ignore("HIVE-23394")
   public void testGenericUDTFOrderBySplitCount1() throws Exception {
-super.testGenericUDTFOrderBySplitCount1("get_splits", new int[]{10, 1, 0, 
2, 2, 2, 1, 10});
+super.testGenericUDTFOrderBySplitCount1("get_splits", new int[] { 10, 5, 0 
});

Review Comment:
   +1





Issue Time Tracking
---

Worklog Id: (was: 860218)
Time Spent: 2h  (was: 1h 50m)

> TestJdbcGenericUDTFGetSplits2#testGenericUDTFOrderBySplitCount1 is flaky
> 
>
> Key: HIVE-23394
> URL: https://issues.apache.org/jira/browse/HIVE-23394
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Simhadri Govindappa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> both 
> TestJdbcGenericUDTFGetSplits2.testGenericUDTFOrderBySplitCount1 and
> TestJdbcGenericUDTFGetSplits.testGenericUDTFOrderBySplitCount1
> can fail with the exception below
> seems like the connection was lost
> {code}
> Error Message
> Failed to close statement
> Stacktrace
> java.sql.SQLException: Failed to close statement
>   at 
> org.apache.hive.jdbc.HiveStatement.closeStatementIfNeeded(HiveStatement.java:200)
>   at 
> org.apache.hive.jdbc.HiveStatement.closeClientOperation(HiveStatement.java:205)
>   at org.apache.hive.jdbc.HiveStatement.close(HiveStatement.java:222)
>   at 
> org.apache.hive.jdbc.AbstractTestJdbcGenericUDTFGetSplits.runQuery(AbstractTestJdbcGenericUDTFGetSplits.java:135)
>   at 
> org.apache.hive.jdbc.AbstractTestJdbcGenericUDTFGetSplits.testGenericUDTFOrderBySplitCount1(AbstractTestJdbcGenericUDTFGetSplits.java:164)
>   at 
> org.apache.hive.jdbc.TestJdbcGenericUDTFGetSplits2.testGenericUDTFOrderBySplitCount1(TestJdbcGenericUDTFGetSplits2.java:28)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> Caused by: org.apache.thrift.TApplicationException: CloseOperation failed: 
> out of sequence response
>   at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:84)
>   at 
> org.apache.hive.service.rpc.thrift.TCLIService$Client.recv_CloseOperation(TCLIService.java:521)
>   at 
> org.apache.hive.service.rpc.thrift.TCLIService$Client.CloseOperation(TCLIService.java:508)
>   at sun.reflect.GeneratedMethodAccessor40.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hive.jdbc.HiveConnection$SynchronizedHandler.invoke(HiveConnection.java:1732)
>   at com.sun.proxy.$Proxy146.CloseOperation(Unknown Source)
>   at 
> org.apache.hive.jdbc.HiveStatement.closeStatementIfNeeded(HiveStatement.java:193)
>   ... 14 more
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27310) Upgrade jQuery to 3.5.1 in HIVE in hive/llap-server/src/main/resources/hive-webapps/llap/js/

2023-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27310?focusedWorklogId=860217=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-860217
 ]

ASF GitHub Bot logged work on HIVE-27310:
-

Author: ASF GitHub Bot
Created on: 03/May/23 07:00
Start Date: 03/May/23 07:00
Worklog Time Spent: 10m 
  Work Description: devaspatikrishnatri commented on PR #4281:
URL: https://github.com/apache/hive/pull/4281#issuecomment-1532544866

   I have used jQuery 3.6.4, and the build and tests appear to pass.
   
   I think this upgrade can be done.




Issue Time Tracking
---

Worklog Id: (was: 860217)
Time Spent: 50m  (was: 40m)

> Upgrade jQuery to 3.5.1 in HIVE in 
> hive/llap-server/src/main/resources/hive-webapps/llap/js/
> 
>
> Key: HIVE-27310
> URL: https://issues.apache.org/jira/browse/HIVE-27310
> Project: Hive
>  Issue Type: Task
>Reporter: Devaspati Krishnatri
>Assignee: Devaspati Krishnatri
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-23394) TestJdbcGenericUDTFGetSplits2#testGenericUDTFOrderBySplitCount1 is flaky

2023-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23394?focusedWorklogId=860215=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-860215
 ]

ASF GitHub Bot logged work on HIVE-23394:
-

Author: ASF GitHub Bot
Created on: 03/May/23 06:57
Start Date: 03/May/23 06:57
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #4249:
URL: https://github.com/apache/hive/pull/4249#discussion_r1183302163


##
itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcGenericUDTFGetSplits.java:
##
@@ -38,14 +38,16 @@
 public class TestJdbcGenericUDTFGetSplits extends 
AbstractTestJdbcGenericUDTFGetSplits {
 
   @Test(timeout = 20)
-  @Ignore("HIVE-23394")
   public void testGenericUDTFOrderBySplitCount1() throws Exception {
-super.testGenericUDTFOrderBySplitCount1("get_splits", new int[]{10, 1, 0, 
2, 2, 2, 1, 10});
+super.testGenericUDTFOrderBySplitCount1("get_splits", new int[] { 10, 5, 
0, 2, 2, 2, 5 });
+super.testGenericUDTFOrderBySplitCount1("get_llap_splits", new int[] { 12, 
7, 1, 4, 4, 4, 7 });
   }
 
+
   @Test(timeout = 20)
   public void testGenericUDTFOrderBySplitCount1OnPartitionedTable() throws 
Exception {
 super.testGenericUDTFOrderBySplitCount1OnPartitionedTable("get_splits", 
new int[]{5, 5, 1, 1, 1});
+
super.testGenericUDTFOrderBySplitCount1OnPartitionedTable("get_llap_splits", 
new int[]{7, 7, 3, 3, 3});

Review Comment:
   Why did we remove TestJdbcGenericUDTFGetSplits2 and merge both UDFs under 
the same test?





Issue Time Tracking
---

Worklog Id: (was: 860215)
Time Spent: 1h 50m  (was: 1h 40m)

> TestJdbcGenericUDTFGetSplits2#testGenericUDTFOrderBySplitCount1 is flaky
> 
>
> Key: HIVE-23394
> URL: https://issues.apache.org/jira/browse/HIVE-23394
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Simhadri Govindappa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> both 
> TestJdbcGenericUDTFGetSplits2.testGenericUDTFOrderBySplitCount1 and
> TestJdbcGenericUDTFGetSplits.testGenericUDTFOrderBySplitCount1
> can fail with the exception below
> seems like the connection was lost
> {code}
> Error Message
> Failed to close statement
> Stacktrace
> java.sql.SQLException: Failed to close statement
>   at 
> org.apache.hive.jdbc.HiveStatement.closeStatementIfNeeded(HiveStatement.java:200)
>   at 
> org.apache.hive.jdbc.HiveStatement.closeClientOperation(HiveStatement.java:205)
>   at org.apache.hive.jdbc.HiveStatement.close(HiveStatement.java:222)
>   at 
> org.apache.hive.jdbc.AbstractTestJdbcGenericUDTFGetSplits.runQuery(AbstractTestJdbcGenericUDTFGetSplits.java:135)
>   at 
> org.apache.hive.jdbc.AbstractTestJdbcGenericUDTFGetSplits.testGenericUDTFOrderBySplitCount1(AbstractTestJdbcGenericUDTFGetSplits.java:164)
>   at 
> org.apache.hive.jdbc.TestJdbcGenericUDTFGetSplits2.testGenericUDTFOrderBySplitCount1(TestJdbcGenericUDTFGetSplits2.java:28)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> Caused by: org.apache.thrift.TApplicationException: CloseOperation failed: 
> out of sequence response
>   at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:84)
>   at 
> org.apache.hive.service.rpc.thrift.TCLIService$Client.recv_CloseOperation(TCLIService.java:521)
>   at 
> org.apache.hive.service.rpc.thrift.TCLIService$Client.CloseOperation(TCLIService.java:508)
>   at sun.reflect.GeneratedMethodAccessor40.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hive.jdbc.HiveConnection$SynchronizedHandler.invoke(HiveConnection.java:1732)
>   at com.sun.proxy.$Proxy146.CloseOperation(Unknown Source)
>   at 
> 

[jira] [Resolved] (HIVE-27172) Add the HMS client connection timeout config

2023-05-03 Thread Denys Kuzmenko (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Denys Kuzmenko resolved HIVE-27172.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

> Add the HMS client connection timeout config
> 
>
> Key: HIVE-27172
> URL: https://issues.apache.org/jira/browse/HIVE-27172
> Project: Hive
>  Issue Type: Task
>  Components: Hive
>Reporter: Wechar
>Assignee: Wechar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> Currently {{HiveMetastoreClient}} uses {{CLIENT_SOCKET_TIMEOUT}} as both 
> the socket timeout and the connection timeout, which makes it inconvenient 
> for users to set a smaller connection timeout.
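> For context, a hedged sketch of how such a split might be used; the 
> connection-timeout key below is an assumption for illustration, not the 
> verified name added by this patch:
> {code:java}
> import java.util.concurrent.TimeUnit;
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;
> import org.apache.hadoop.hive.metastore.conf.MetastoreConf;
> 
> public class HmsClientTimeoutExample {
>   public static void main(String[] args) throws Exception {
>     Configuration conf = MetastoreConf.newMetastoreConf();
>     // Existing knob: governs socket reads (and, before this change, connects too).
>     MetastoreConf.setTimeVar(conf, MetastoreConf.ConfVars.CLIENT_SOCKET_TIMEOUT,
>         600, TimeUnit.SECONDS);
>     // Assumed new knob: a smaller, separate connection timeout (name hypothetical).
>     conf.set("metastore.client.connection.timeout", "10s");
>     HiveMetaStoreClient client = new HiveMetaStoreClient(conf);
>     try {
>       System.out.println(client.getAllDatabases());
>     } finally {
>       client.close();
>     }
>   }
> }
> {code}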



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HIVE-27172) Add the HMS client connection timeout config

2023-05-03 Thread Denys Kuzmenko (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17718805#comment-17718805
 ] 

Denys Kuzmenko commented on HIVE-27172:
---

Merged to master.
[~wechar], thanks for the contribution and [~pancheng] for the review!

> Add the HMS client connection timeout config
> 
>
> Key: HIVE-27172
> URL: https://issues.apache.org/jira/browse/HIVE-27172
> Project: Hive
>  Issue Type: Task
>  Components: Hive
>Reporter: Wechar
>Assignee: Wechar
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> Currently {{HiveMetastoreClient}} uses {{CLIENT_SOCKET_TIMEOUT}} as both 
> the socket timeout and the connection timeout, which makes it inconvenient 
> for users to set a smaller connection timeout.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27172) Add the HMS client connection timeout config

2023-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27172?focusedWorklogId=860213=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-860213
 ]

ASF GitHub Bot logged work on HIVE-27172:
-

Author: ASF GitHub Bot
Created on: 03/May/23 06:45
Start Date: 03/May/23 06:45
Worklog Time Spent: 10m 
  Work Description: deniskuzZ merged PR #4150:
URL: https://github.com/apache/hive/pull/4150




Issue Time Tracking
---

Worklog Id: (was: 860213)
Time Spent: 3h 10m  (was: 3h)

> Add the HMS client connection timeout config
> 
>
> Key: HIVE-27172
> URL: https://issues.apache.org/jira/browse/HIVE-27172
> Project: Hive
>  Issue Type: Task
>  Components: Hive
>Reporter: Wechar
>Assignee: Wechar
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> Currently {{HiveMetastoreClient}} uses {{CLIENT_SOCKET_TIMEOUT}} as both 
> the socket timeout and the connection timeout, which makes it inconvenient 
> for users to set a smaller connection timeout.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)