[jira] [Assigned] (HIVE-27195) Drop table if Exists <dbname>.<tablename> fails during authorization for temporary tables

2023-03-29 Thread Riju Trivedi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Riju Trivedi reassigned HIVE-27195:
---


> Drop table if Exists <dbname>.<tablename> fails during authorization for 
> temporary tables
> ---
>
> Key: HIVE-27195
> URL: https://issues.apache.org/jira/browse/HIVE-27195
> Project: Hive
>  Issue Type: Bug
>Reporter: Riju Trivedi
>Assignee: Riju Trivedi
>Priority: Major
>
> https://issues.apache.org/jira/browse/HIVE-20051 handles skipping 
> authorization for temporary tables. But DROP TABLE IF EXISTS on a temporary 
> table still fails with HiveAccessControlException.
> Steps to Repro:
> {code:java}
> use test; CREATE TEMPORARY TABLE temp_table (id int);
> drop table if exists test.temp_table;
> Error: Error while compiling statement: FAILED: HiveAccessControlException 
> Permission denied: user [rtrivedi] does not have [DROP] privilege on 
> [test/temp_table] (state=42000,code=4) {code}
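A plausible shape of the fix, sketched below as an assumption (HIVE-20051 already skips authorization for temporary tables; the guard and its placement are illustrative, not the actual patch, though Table#isTemporary() is existing Hive API): the db-qualified DROP path should take the same shortcut as the unqualified path.

{code:java}
import org.apache.hadoop.hive.ql.metadata.Table;

// Illustrative guard only: a session-level temporary table is visible solely
// to the creating session, so an external authorizer (e.g. Ranger) has
// nothing meaningful to check for it.
static boolean needsDropAuthorization(Table table) {
  return table != null && !table.isTemporary();
}
{code}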



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27179) HS2 WebUI throws NPE when JspFactory loaded from jetty-runner

2023-03-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27179?focusedWorklogId=853828&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853828
 ]

ASF GitHub Bot logged work on HIVE-27179:
-

Author: ASF GitHub Bot
Created on: 30/Mar/23 04:49
Start Date: 30/Mar/23 04:49
Worklog Time Spent: 10m 
  Work Description: dengzhhu653 commented on code in PR #4164:
URL: https://github.com/apache/hive/pull/4164#discussion_r1152729260


##
service/src/java/org/apache/hive/service/server/HiveServer2.java:
##
@@ -133,6 +133,8 @@
 import com.google.common.util.concurrent.SettableFuture;
 import com.google.common.util.concurrent.ThreadFactoryBuilder;
 
+import javax.servlet.jsp.JspFactory;

Review Comment:
   Hi @saihemanth-cloudera, the jetty-runner jar contains the classes:
   ```
   [Loaded javax.servlet.jsp.JspFactory from 
file:/opt/xxx/jetty-runner-9.4.48.v20220622.jar]
   [Loaded javax.servlet.jsp.JspContext from 
file:/opt/xxx/jetty-runner-9.4.48.v20220622.jar]
   [Loaded javax.servlet.jsp.PageContext from 
file:/opt/xxx/jetty-runner-9.4.48.v20220622.jar]
   ```
   In fact this is where the class conflict happens.
   





Issue Time Tracking
---

Worklog Id: (was: 853828)
Time Spent: 50m  (was: 40m)

> HS2 WebUI throws NPE when JspFactory loaded from jetty-runner
> -
>
> Key: HIVE-27179
> URL: https://issues.apache.org/jira/browse/HIVE-27179
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Zhihua Deng
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> In HIVE-17088, we resolved an NPE thrown from the HS2 WebUI by introducing 
> javax.servlet.jsp-api. It works as expected when the javax.servlet.jsp-api 
> jar takes precedence over the jetty-runner jar, but in some environments the 
> ordering differs and opening the HS2 web UI still throws an NPE:
> {noformat}
> java.lang.NullPointerException 
> at 
> org.apache.hive.generated.hiveserver2.hiveserver2_jsp._jspService(hiveserver2_jsp.java:286)
>  
> at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:71) 
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:790) 
> at 
> org.eclipse.jetty.servlet.ServletHolder$NotAsync.service(ServletHolder.java:1443)
>  
> at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:791) 
> at 
> org.eclipse.jetty.servlet.ServletHandler$ChainEnd.doFilter(ServletHandler.java:1626)
> ...{noformat}
> The jetty-runner JspFactory.getDefaultFactory() just returns null.
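One defensive direction, sketched as an assumption from the javax.servlet.jsp.JspFactory import that PR #4164 adds to HiveServer2.java (the actual patch may differ; org.apache.jasper.runtime.JspFactoryImpl is Jasper's standard implementation): install a default factory explicitly before starting the web UI, so the lookup no longer depends on which jar is loaded first.

{code:java}
import javax.servlet.jsp.JspFactory;

// Sketch only: when the jetty-runner copy of JspFactory was loaded first,
// getDefaultFactory() returns null; installing Jasper's implementation
// restores what the generated JSP pages expect.
if (JspFactory.getDefaultFactory() == null) {
  JspFactory.setDefaultFactory(new org.apache.jasper.runtime.JspFactoryImpl());
}
{code}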



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27150) Drop single partition can also support direct sql

2023-03-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27150?focusedWorklogId=853827&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853827
 ]

ASF GitHub Bot logged work on HIVE-27150:
-

Author: ASF GitHub Bot
Created on: 30/Mar/23 04:17
Start Date: 30/Mar/23 04:17
Worklog Time Spent: 10m 
  Work Description: saihemanth-cloudera commented on code in PR #4123:
URL: https://github.com/apache/hive/pull/4123#discussion_r1152712297


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/RawStore.java:
##
@@ -459,16 +459,15 @@ boolean doesPartitionExist(String catName, String dbName, String tableName,
* @param catName catalog name.
* @param dbName database name.
* @param tableName table name.
-   * @param part_vals list of partition values.
+   * @param partName partition name.
* @return true if the partition was dropped.
* @throws MetaException Error accessing the RDBMS.
* @throws NoSuchObjectException no partition matching this description exists
* @throws InvalidObjectException error dropping the statistics for the partition
* @throws InvalidInputException error dropping the statistics for the partition
*/
-  boolean dropPartition(String catName, String dbName, String tableName,
-  List<String> part_vals) throws MetaException, NoSuchObjectException, InvalidObjectException,
-  InvalidInputException;
+  boolean dropPartition(String catName, String dbName, String tableName, String partName)

Review Comment:
   I would favor this change instead of passing the table into the drop 
partition API in the object store.





Issue Time Tracking
---

Worklog Id: (was: 853827)
Time Spent: 2h 40m  (was: 2.5h)

> Drop single partition can also support direct sql
> -
>
> Key: HIVE-27150
> URL: https://issues.apache.org/jira/browse/HIVE-27150
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Wechar
>Assignee: Wechar
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> *Background:*
> [HIVE-6980|https://issues.apache.org/jira/browse/HIVE-6980] supports direct 
> SQL for drop_partitions; we can reuse this substantial improvement in drop_partition.
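For context, the name-based overload in the PR's RawStore diff takes the canonical key=value partition name. A minimal sketch of a caller, assuming the existing metastore helper Warehouse.makePartName (the call site itself is illustrative):

{code:java}
import org.apache.hadoop.hive.metastore.Warehouse;

// Build "ds=2023-03-29/hr=00" from the table's partition keys and this
// partition's values, then drop through the name-based API so the direct-SQL
// path of drop_partitions can be reused for a single partition.
String partName = Warehouse.makePartName(table.getPartitionKeys(), partVals);
boolean dropped = rawStore.dropPartition(catName, dbName, tableName, partName);
{code}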



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27179) HS2 WebUI throws NPE when JspFactory loaded from jetty-runner

2023-03-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27179?focusedWorklogId=853826&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853826
 ]

ASF GitHub Bot logged work on HIVE-27179:
-

Author: ASF GitHub Bot
Created on: 30/Mar/23 04:12
Start Date: 30/Mar/23 04:12
Worklog Time Spent: 10m 
  Work Description: saihemanth-cloudera commented on code in PR #4164:
URL: https://github.com/apache/hive/pull/4164#discussion_r1152709966


##
service/src/java/org/apache/hive/service/server/HiveServer2.java:
##
@@ -133,6 +133,8 @@
 import com.google.common.util.concurrent.SettableFuture;
 import com.google.common.util.concurrent.ThreadFactoryBuilder;
 
+import javax.servlet.jsp.JspFactory;

Review Comment:
   I see that you have removed the javax.servlet dependency from the root and 
service pom.xml files. I'm wondering where this dependency is coming from.





Issue Time Tracking
---

Worklog Id: (was: 853826)
Time Spent: 40m  (was: 0.5h)

> HS2 WebUI throws NPE when JspFactory loaded from jetty-runner
> -
>
> Key: HIVE-27179
> URL: https://issues.apache.org/jira/browse/HIVE-27179
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Zhihua Deng
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> In HIVE-17088, we resolved an NPE thrown from the HS2 WebUI by introducing 
> javax.servlet.jsp-api. It works as expected when the javax.servlet.jsp-api 
> jar takes precedence over the jetty-runner jar, but in some environments the 
> ordering differs and opening the HS2 web UI still throws an NPE:
> {noformat}
> java.lang.NullPointerException 
> at 
> org.apache.hive.generated.hiveserver2.hiveserver2_jsp._jspService(hiveserver2_jsp.java:286)
>  
> at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:71) 
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:790) 
> at 
> org.eclipse.jetty.servlet.ServletHolder$NotAsync.service(ServletHolder.java:1443)
>  
> at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:791) 
> at 
> org.eclipse.jetty.servlet.ServletHandler$ChainEnd.doFilter(ServletHandler.java:1626)
> ...{noformat}
> The jetty-runner JspFactory.getDefaultFactory() just returns null.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27192) Use normal import instead of shaded import in TestSchemaToolCatalogOps.java

2023-03-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27192?focusedWorklogId=853825&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853825
 ]

ASF GitHub Bot logged work on HIVE-27192:
-

Author: ASF GitHub Bot
Created on: 30/Mar/23 04:02
Start Date: 30/Mar/23 04:02
Worklog Time Spent: 10m 
  Work Description: SourabhBadhya commented on code in PR #4169:
URL: https://github.com/apache/hive/pull/4169#discussion_r1152705138


##
itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/tools/schematool/TestSchemaToolCatalogOps.java:
##
@@ -35,7 +35,7 @@
 import org.apache.hadoop.hive.metastore.client.builder.PartitionBuilder;
 import org.apache.hadoop.hive.metastore.client.builder.TableBuilder;
 import org.apache.hadoop.hive.metastore.conf.MetastoreConf;
-import org.apache.hive.com.google.common.io.Files;
+import com.google.common.io.Files;

Review Comment:
   nit: After removing the shading pattern, this import no longer follows the 
alphabetical ordering of imports





Issue Time Tracking
---

Worklog Id: (was: 853825)
Time Spent: 0.5h  (was: 20m)

> Use normal import instead of shaded import in TestSchemaToolCatalogOps.java
> ---
>
> Key: HIVE-27192
> URL: https://issues.apache.org/jira/browse/HIVE-27192
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltán Rátkai
>Assignee: Zoltán Rátkai
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-22383) `alterPartitions` is invoked twice during dynamic partition load causing runtime delay

2023-03-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22383?focusedWorklogId=853814&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853814
 ]

ASF GitHub Bot logged work on HIVE-22383:
-

Author: ASF GitHub Bot
Created on: 30/Mar/23 02:18
Start Date: 30/Mar/23 02:18
Worklog Time Spent: 10m 
  Work Description: rbalamohan commented on code in PR #4161:
URL: https://github.com/apache/hive/pull/4161#discussion_r1152661303


##
ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java:
##
@@ -2951,9 +2951,11 @@ private void setStatsPropAndAlterPartitions(boolean resetStatistics, Table tbl,
   validWriteIdList = tableSnapshot.getValidWriteIdList();
   writeId = tableSnapshot.getWriteId();
 }
-getSynchronizedMSC().alter_partitions(tbl.getCatName(), tbl.getDbName(), tbl.getTableName(),
-partitions.stream().map(Partition::getTPartition).collect(Collectors.toList()),
-ec, validWriteIdList, writeId);
+if (!conf.getBoolVar(ConfVars.HIVESTATSAUTOGATHER)){

Review Comment:
   Should this be in the starting of the method itself?





Issue Time Tracking
---

Worklog Id: (was: 853814)
Time Spent: 50m  (was: 40m)

> `alterPartitions` is invoked twice during dynamic partition load causing 
> runtime delay
> --
>
> Key: HIVE-22383
> URL: https://issues.apache.org/jira/browse/HIVE-22383
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Assignee: Dmitriy Fingerman
>Priority: Major
>  Labels: performance, pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> First invocation in {{Hive::loadDynamicPartitions}}
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L2978
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L2638
> Second invocation in {{BasicStatsTask::aggregateStats}}
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/stats/BasicStatsTask.java#L335
> This leads to a significant delay in dynamic partition loading.
>  
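Reading the PR diff above together with the two call sites listed in the description, the intended shape of the fix appears to be the guard below, stated here as an assumption (HiveConf.ConfVars.HIVESTATSAUTOGATHER is Hive's existing hive.stats.autogather switch): when stats autogathering is on, BasicStatsTask#aggregateStats will alter the partitions anyway while persisting stats, so the first alter_partitions RPC can be skipped.

{code:java}
// Illustrative guard: issue the metastore alter_partitions call here only
// when stats autogather is off; otherwise BasicStatsTask performs the single
// alter while writing basic stats, avoiding the duplicate round trip.
if (!conf.getBoolVar(HiveConf.ConfVars.HIVESTATSAUTOGATHER)) {
  getSynchronizedMSC().alter_partitions(tbl.getCatName(), tbl.getDbName(), tbl.getTableName(),
      partitions.stream().map(Partition::getTPartition).collect(Collectors.toList()),
      ec, validWriteIdList, writeId);
}
{code}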



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HIVE-27194) Support expression in limit and offset clauses

2023-03-29 Thread vamshi kolanu (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vamshi kolanu reassigned HIVE-27194:



> Support expression in limit and offset clauses
> --
>
> Key: HIVE-27194
> URL: https://issues.apache.org/jira/browse/HIVE-27194
> Project: Hive
>  Issue Type: Task
>  Components: Hive
>Reporter: vamshi kolanu
>Assignee: vamshi kolanu
>Priority: Major
>
> As part of this task, support expressions in both limit and offset clauses. 
> Currently, these clauses only support integer literals.
> For example: The following expressions will be supported after this change.
> 1. select key from (select * from src limit (1+2*3)) q1;
> 2. select key from (select * from src limit (1+2*3) offset (3*4*5)) q1;



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HIVE-22383) `alterPartitions` is invoked twice during dynamic partition load causing runtime delay

2023-03-29 Thread Dmitriy Fingerman (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17706649#comment-17706649
 ] 

Dmitriy Fingerman commented on HIVE-22383:
--

Hi [~rajesh.balamohan], could you please review the PR?

> `alterPartitions` is invoked twice during dynamic partition load causing 
> runtime delay
> --
>
> Key: HIVE-22383
> URL: https://issues.apache.org/jira/browse/HIVE-22383
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Assignee: Dmitriy Fingerman
>Priority: Major
>  Labels: performance, pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> First invocation in {{Hive::loadDynamicPartitions}}
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L2978
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L2638
> Second invocation in {{BasicStatsTask::aggregateStats}}
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/stats/BasicStatsTask.java#L335
> This leads to a significant delay in dynamic partition loading.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27165) PART_COL_STATS metastore query not hitting the index

2023-03-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27165?focusedWorklogId=853808&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853808
 ]

ASF GitHub Bot logged work on HIVE-27165:
-

Author: ASF GitHub Bot
Created on: 30/Mar/23 00:25
Start Date: 30/Mar/23 00:25
Worklog Time Spent: 10m 
  Work Description: sonarcloud[bot] commented on PR #4141:
URL: https://github.com/apache/hive/pull/4141#issuecomment-1489516433

   Kudos, SonarCloud Quality Gate passed!
   (https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=4141)
   
   0 Bugs (rating A)
   0 Vulnerabilities (rating A)
   0 Security Hotspots (rating A)
   0 Code Smells (rating A)
   No Coverage information
   No Duplication information




Issue Time Tracking
---

Worklog Id: (was: 853808)
Time Spent: 50m  (was: 40m)

> PART_COL_STATS metastore query not hitting the index
> 
>
> Key: HIVE-27165
> URL: https://issues.apache.org/jira/browse/HIVE-27165
> Project: Hive
>  Issue Type: Improvement
>Reporter: Hongdan Zhu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> The query located here:
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionTxnHandler.java#L1029-L1032]
> is not hitting an index.  The index contains CAT_NAME whereas this query does 
> not. This was a change made in Hive 3.0, I think.
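A sketch of the kind of change implied, stated as an assumption (the affected statement lives in CompactionTxnHandler per the link above; the SQL below is illustrative, not the actual query text): adding a CAT_NAME predicate lets a composite index that leads with CAT_NAME be used.

{code:java}
// Illustrative only. An index on (CAT_NAME, DB_NAME, TABLE_NAME, ...) cannot
// serve a predicate that starts at DB_NAME:
String before = "SELECT \"COLUMN_NAME\" FROM \"PART_COL_STATS\""
    + " WHERE \"DB_NAME\" = ? AND \"TABLE_NAME\" = ?";
// Anchoring on CAT_NAME matches the index's leading column:
String after = "SELECT \"COLUMN_NAME\" FROM \"PART_COL_STATS\""
    + " WHERE \"CAT_NAME\" = ? AND \"DB_NAME\" = ? AND \"TABLE_NAME\" = ?";
{code}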



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27135) AcidUtils#getHdfsDirSnapshots() throws FNFE when a directory is removed in HDFS

2023-03-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27135?focusedWorklogId=853807&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853807
 ]

ASF GitHub Bot logged work on HIVE-27135:
-

Author: ASF GitHub Bot
Created on: 29/Mar/23 23:47
Start Date: 29/Mar/23 23:47
Worklog Time Spent: 10m 
  Work Description: sonarcloud[bot] commented on PR #4114:
URL: https://github.com/apache/hive/pull/4114#issuecomment-1489482725

   Kudos, SonarCloud Quality Gate passed!
   (https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=4114)
   
   0 Bugs (rating A)
   0 Vulnerabilities (rating A)
   0 Security Hotspots (rating A)
   0 Code Smells (rating A)
   No Coverage information
   No Duplication information




Issue Time Tracking
---

Worklog Id: (was: 853807)
Time Spent: 6h 40m  (was: 6.5h)

> AcidUtils#getHdfsDirSnapshots() throws FNFE when a directory is removed in 
> HDFS
> ---
>
> Key: HIVE-27135
> URL: https://issues.apache.org/jira/browse/HIVE-27135
> Project: Hive
>  Issue Type: Bug
>Reporter: Dayakar M
>Assignee: Dayakar M
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 6h 40m
>  Remaining Estimate: 0h
>
> AcidUtils#getHdfsDirSnapshots() throws FileNotFoundException when a directory 
> is removed in HDFS while fetching HDFS Snapshots.
> Below testcode can be used to reproduce this issue.
> {code:java}
>  @Test
>   public void 
> testShouldNotThrowFNFEWhenHiveStagingDirectoryIsRemovedWhileFetchingHDFSSnapshots()
>  throws Exception {
> MockFileSystem fs = new MockFileSystem(new HiveConf(),
> new MockFile("mock:/tbl/part1/.hive-staging_dir/-ext-10002", 500, new 
> byte[0]),
> new MockFile("mock:/tbl/part2/.hive-staging_dir", 500, new byte[0]),
> new MockFile("mock:/tbl/part1/_tmp_space.db", 500, new byte[0]),
> new MockFile("mock:/tbl/part1/delta_1_1/bucket--", 500, new 
> byte[0]));
> Path path = new MockPath(fs, "/tbl");
> Path stageDir = new MockPath(fs, "mock:/tbl/part1/.hive-staging_dir");
> FileSystem mockFs = spy(fs);
> Mockito.doThrow(new 
> 

[jira] [Updated] (HIVE-27136) Backport HIVE-27129 to branch-3

2023-03-29 Thread Junlin Zeng (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junlin Zeng updated HIVE-27136:
---
Description: 
Create this ticket to track backport 
 # HIVE-27129

> Backport HIVE-27129 to branch-3
> ---
>
> Key: HIVE-27136
> URL: https://issues.apache.org/jira/browse/HIVE-27136
> Project: Hive
>  Issue Type: Improvement
>Reporter: Junlin Zeng
>Assignee: Junlin Zeng
>Priority: Major
>
> Create this ticket to track backport 
>  # HIVE-27129



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work started] (HIVE-27128) Exception "Can't finish byte read from uncompressed stream DATA position" when querying ORC table

2023-03-29 Thread Dmitriy Fingerman (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-27128 started by Dmitriy Fingerman.

> Exception "Can't finish byte read from uncompressed stream DATA position" 
> when querying ORC table
> -
>
> Key: HIVE-27128
> URL: https://issues.apache.org/jira/browse/HIVE-27128
> Project: Hive
>  Issue Type: Bug
>Reporter: Dmitriy Fingerman
>Assignee: Dmitriy Fingerman
>Priority: Critical
>
> Exception happening when querying an ORC table:
> {code:java}
> Caused by: java.io.EOFException: Can't finish byte read from uncompressed 
> stream DATA position: 393216 length: 393216 range: 23 offset: 376832 
> position: 16384 limit: 16384
>   at 
> org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays(TreeReaderFactory.java:1550)
>   at 
> org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.readOrcByteArrays(TreeReaderFactory.java:1566)
>   at 
> org.apache.orc.impl.TreeReaderFactory$StringDirectTreeReader.nextVector(TreeReaderFactory.java:1662)
>   at 
> org.apache.orc.impl.TreeReaderFactory$StringTreeReader.nextVector(TreeReaderFactory.java:1508)
>   at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedTreeReaderFactory$StringStreamReader.nextVector(EncodedTreeReaderFactory.java:305)
>   at 
> org.apache.hadoop.hive.llap.io.decode.OrcEncodedDataConsumer.decodeBatch(OrcEncodedDataConsumer.java:196)
>   at 
> org.apache.hadoop.hive.llap.io.decode.OrcEncodedDataConsumer.decodeBatch(OrcEncodedDataConsumer.java:66)
>   at 
> org.apache.hadoop.hive.llap.io.decode.EncodedDataConsumer.consumeData(EncodedDataConsumer.java:122)
>   at 
> org.apache.hadoop.hive.llap.io.decode.EncodedDataConsumer.consumeData(EncodedDataConsumer.java:42)
>   at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:608)
>   at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:434)
>   at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:282)
>   at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:279)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878)
>   at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:279)
>   at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:118)
>   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>   at 
> org.apache.hadoop.hive.llap.io.decode.EncodedDataConsumer$CpuRecordingCallable.call(EncodedDataConsumer.java:88)
>   at 
> org.apache.hadoop.hive.llap.io.decode.EncodedDataConsumer$CpuRecordingCallable.call(EncodedDataConsumer.java:73)
>  {code}
> I created a q-test that reproduces this issue:
> [https://github.com/difin/hive/commits/orc_read_err_qtest]
> This issue happens in Hive starting from the commit that upgraded ORC version 
> in Hive to ORC 1.6.7.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-26949) Backport HIVE-26071 to branch-3

2023-03-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-26949:
--
Labels: pull-request-available  (was: )

> Backport HIVE-26071 to branch-3
> ---
>
> Key: HIVE-26949
> URL: https://issues.apache.org/jira/browse/HIVE-26949
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore, Standalone Metastore
>Reporter: Vihang Karajgaonkar
>Assignee: Junlin Zeng
>Priority: Blocker
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Creating this ticket to backport HIVE-26071 to branch-3.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26949) Backport HIVE-26071 to branch-3

2023-03-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26949?focusedWorklogId=853797&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853797
 ]

ASF GitHub Bot logged work on HIVE-26949:
-

Author: ASF GitHub Bot
Created on: 29/Mar/23 21:00
Start Date: 29/Mar/23 21:00
Worklog Time Spent: 10m 
  Work Description: junlinzeng-db opened a new pull request, #4172:
URL: https://github.com/apache/hive/pull/4172

   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   




Issue Time Tracking
---

Worklog Id: (was: 853797)
Remaining Estimate: 0h
Time Spent: 10m

> Backport HIVE-26071 to branch-3
> ---
>
> Key: HIVE-26949
> URL: https://issues.apache.org/jira/browse/HIVE-26949
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore, Standalone Metastore
>Reporter: Vihang Karajgaonkar
>Assignee: Junlin Zeng
>Priority: Blocker
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Creating this ticket to backport HIVE-26071 to branch-3.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HIVE-27128) Exception "Can't finish byte read from uncompressed stream DATA position" when querying ORC table

2023-03-29 Thread Dmitriy Fingerman (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17706598#comment-17706598
 ] 

Dmitriy Fingerman commented on HIVE-27128:
--

This can be fixed once HIVE-26809 ('Upgrade ORC to 1.8.3') is resolved, because 
ORC 1.8.3 contains ORC-1393, which is required to fix this issue.

> Exception "Can't finish byte read from uncompressed stream DATA position" 
> when querying ORC table
> -
>
> Key: HIVE-27128
> URL: https://issues.apache.org/jira/browse/HIVE-27128
> Project: Hive
>  Issue Type: Bug
>Reporter: Dmitriy Fingerman
>Assignee: Dmitriy Fingerman
>Priority: Critical
>
> Exception happening when querying an ORC table:
> {code:java}
> Caused by: java.io.EOFException: Can't finish byte read from uncompressed 
> stream DATA position: 393216 length: 393216 range: 23 offset: 376832 
> position: 16384 limit: 16384
>   at 
> org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays(TreeReaderFactory.java:1550)
>   at 
> org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.readOrcByteArrays(TreeReaderFactory.java:1566)
>   at 
> org.apache.orc.impl.TreeReaderFactory$StringDirectTreeReader.nextVector(TreeReaderFactory.java:1662)
>   at 
> org.apache.orc.impl.TreeReaderFactory$StringTreeReader.nextVector(TreeReaderFactory.java:1508)
>   at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedTreeReaderFactory$StringStreamReader.nextVector(EncodedTreeReaderFactory.java:305)
>   at 
> org.apache.hadoop.hive.llap.io.decode.OrcEncodedDataConsumer.decodeBatch(OrcEncodedDataConsumer.java:196)
>   at 
> org.apache.hadoop.hive.llap.io.decode.OrcEncodedDataConsumer.decodeBatch(OrcEncodedDataConsumer.java:66)
>   at 
> org.apache.hadoop.hive.llap.io.decode.EncodedDataConsumer.consumeData(EncodedDataConsumer.java:122)
>   at 
> org.apache.hadoop.hive.llap.io.decode.EncodedDataConsumer.consumeData(EncodedDataConsumer.java:42)
>   at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:608)
>   at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:434)
>   at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:282)
>   at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:279)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878)
>   at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:279)
>   at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:118)
>   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>   at 
> org.apache.hadoop.hive.llap.io.decode.EncodedDataConsumer$CpuRecordingCallable.call(EncodedDataConsumer.java:88)
>   at 
> org.apache.hadoop.hive.llap.io.decode.EncodedDataConsumer$CpuRecordingCallable.call(EncodedDataConsumer.java:73)
>  {code}
> I created a q-test that reproduces this issue:
> [https://github.com/difin/hive/commits/orc_read_err_qtest]
> This issue happens in Hive starting from the commit that upgraded ORC version 
> in Hive to ORC 1.6.7.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26997) Iceberg: Vectorization gets disabled at runtime in merge-into statements

2023-03-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26997?focusedWorklogId=853789&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853789
 ]

ASF GitHub Bot logged work on HIVE-26997:
-

Author: ASF GitHub Bot
Created on: 29/Mar/23 20:18
Start Date: 29/Mar/23 20:18
Worklog Time Spent: 10m 
  Work Description: sonarcloud[bot] commented on PR #4162:
URL: https://github.com/apache/hive/pull/4162#issuecomment-1489249100

   Kudos, SonarCloud Quality Gate passed!
   (https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=4162)
   
   0 Bugs (rating A)
   0 Vulnerabilities (rating A)
   0 Security Hotspots (rating A)
   0 Code Smells (rating A)
   No Coverage information
   No Duplication information




Issue Time Tracking
---

Worklog Id: (was: 853789)
Time Spent: 1.5h  (was: 1h 20m)

> Iceberg: Vectorization gets disabled at runtime in merge-into statements
> 
>
> Key: HIVE-26997
> URL: https://issues.apache.org/jira/browse/HIVE-26997
> Project: Hive
>  Issue Type: Improvement
>  Components: Iceberg integration
>Reporter: Rajesh Balamohan
>Assignee: Zsolt Miskolczi
>Priority: Major
>  Labels: pull-request-available
> Attachments: explain_merge_into.txt
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> *Query:*
> Think of "ssv" table as a table containing trickle feed data in the following 
> query. "store_sales_delete_1" is the destination table.
>  
> {noformat}
> MERGE INTO tpcds_1000_iceberg_mor_v4.store_sales_delete_1 t
> USING tpcds_1000_update.ssv s
> ON (t.ss_item_sk = s.ss_item_sk
>     AND t.ss_customer_sk = s.ss_customer_sk
>     AND t.ss_sold_date_sk = "2451181"
>     AND ((Floor((s.ss_item_sk) / 1000) * 1000) BETWEEN 1000 AND 2000)
>     AND s.ss_ext_discount_amt < 0.0) 

[jira] [Work logged] (HIVE-27165) PART_COL_STATS metastore query not hitting the index

2023-03-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27165?focusedWorklogId=853788&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853788
 ]

ASF GitHub Bot logged work on HIVE-27165:
-

Author: ASF GitHub Bot
Created on: 29/Mar/23 20:18
Start Date: 29/Mar/23 20:18
Worklog Time Spent: 10m 
  Work Description: sonarcloud[bot] commented on PR #4141:
URL: https://github.com/apache/hive/pull/4141#issuecomment-1489248563

   Kudos, SonarCloud Quality Gate passed!
   (https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=4141)
   
   0 Bugs (rating A)
   0 Vulnerabilities (rating A)
   0 Security Hotspots (rating A)
   0 Code Smells (rating A)
   No Coverage information
   No Duplication information




Issue Time Tracking
---

Worklog Id: (was: 853788)
Time Spent: 40m  (was: 0.5h)

> PART_COL_STATS metastore query not hitting the index
> 
>
> Key: HIVE-27165
> URL: https://issues.apache.org/jira/browse/HIVE-27165
> Project: Hive
>  Issue Type: Improvement
>Reporter: Hongdan Zhu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> The query located here:
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionTxnHandler.java#L1029-L1032]
> is not hitting an index.  The index contains CAT_NAME whereas this query does 
> not. This was a change made in Hive 3.0, I think.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26800) Backport HIVE-21755 : Upgrading SQL server backed metastore when changing

2023-03-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26800?focusedWorklogId=853773&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853773
 ]

ASF GitHub Bot logged work on HIVE-26800:
-

Author: ASF GitHub Bot
Created on: 29/Mar/23 18:59
Start Date: 29/Mar/23 18:59
Worklog Time Spent: 10m 
  Work Description: amanraj2520 commented on PR #4021:
URL: https://github.com/apache/hive/pull/4021#issuecomment-1489140678

   @abstractdog @zabetak @vihangk1 Can you please approve and merge this?




Issue Time Tracking
---

Worklog Id: (was: 853773)
Time Spent: 0.5h  (was: 20m)

> Backport HIVE-21755 : Upgrading SQL server backed metastore when changing
> -
>
> Key: HIVE-26800
> URL: https://issues.apache.org/jira/browse/HIVE-26800
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Aman Raj
>Assignee: Aman Raj
>Priority: Major
>  Labels: hive-3.2.0-must, pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26900) Error message not representing the correct line number with a syntax error in a HQL File

2023-03-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26900?focusedWorklogId=853760&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853760
 ]

ASF GitHub Bot logged work on HIVE-26900:
-

Author: ASF GitHub Bot
Created on: 29/Mar/23 17:58
Start Date: 29/Mar/23 17:58
Worklog Time Spent: 10m 
  Work Description: sonarcloud[bot] commented on PR #4171:
URL: https://github.com/apache/hive/pull/4171#issuecomment-1489056490

   Kudos, SonarCloud Quality Gate passed!
   (https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=4171)
   
   0 Bugs (rating A)
   0 Vulnerabilities (rating A)
   0 Security Hotspots (rating A)
   2 Code Smells (rating A)
   No Coverage information
   No Duplication information




Issue Time Tracking
---

Worklog Id: (was: 853760)
Time Spent: 1h 40m  (was: 1.5h)

> Error message not representing the correct line number with a syntax error in 
> a HQL File
> 
>
> Key: HIVE-26900
> URL: https://issues.apache.org/jira/browse/HIVE-26900
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.2, 4.0.0-alpha-1, 4.0.0-alpha-2
>Reporter: Vikram Ahuja
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> When a syntax error is present in an HQL file, the error thrown by Beeline while 
> running the HQL file reports the wrong line number. Both the line number and 
> the position are incorrect. It seems the parser does not account for spaces 
> and new lines and always reports the error on line 1, irrespective of 
> which line the error is actually on in the HQL file.
>  
> For instance, consider the following test.hql file:
>  # --comment
>  # --comment
>  # SET hive.server2.logging.operation.enabled=true;
>  # SET hive.server2.logging.operation.level=VERBOSE;
>  # show tables;
>  #  
>  #  
>  #       CREATE TABLEE DUMMY;
>  
> when we call !run test.hql in Beeline or trigger ./beeline -u 
> jdbc:hive2://localhost:1 -f test.hql, the error thrown by Beeline is
> >>> CREATE TABLEE DUMMY;
> Error: Error while compiling statement: FAILED: ParseException line 1:7 
> cannot recognize input near 'CREATE' 

[jira] [Work logged] (HIVE-27135) AcidUtils#getHdfsDirSnapshots() throws FNFE when a directory is removed in HDFS

2023-03-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27135?focusedWorklogId=853745&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853745
 ]

ASF GitHub Bot logged work on HIVE-27135:
-

Author: ASF GitHub Bot
Created on: 29/Mar/23 17:03
Start Date: 29/Mar/23 17:03
Worklog Time Spent: 10m 
  Work Description: mdayakar commented on code in PR #4114:
URL: https://github.com/apache/hive/pull/4114#discussion_r1152246387


##
ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java:
##
@@ -1538,32 +1538,36 @@ private static HdfsDirSnapshot addToSnapshot(Map<Path, HdfsDirSnapshot> dirToSna
   public static Map<Path, HdfsDirSnapshot> getHdfsDirSnapshots(final FileSystem fs, final Path path)
   throws IOException {
 Map<Path, HdfsDirSnapshot> dirToSnapshots = new HashMap<>();
-RemoteIterator<FileStatus> itr = FileUtils.listFiles(fs, path, true, acidHiddenFileFilter);
-while (itr.hasNext()) {
-  FileStatus fStatus = itr.next();
-  Path fPath = fStatus.getPath();
-  if (fStatus.isDirectory() && acidTempDirFilter.accept(fPath)) {
-addToSnapshot(dirToSnapshots, fPath);
-  } else {
-Path parentDirPath = fPath.getParent();
-if (acidTempDirFilter.accept(parentDirPath)) {
-  while (isChildOfDelta(parentDirPath, path)) {
-// Some cases there are other directory layers between the delta and the datafiles
-// (export-import mm table, insert with union all to mm table, skewed tables).
-// But it does not matter for the AcidState, we just need the deltas and the data files
-// So build the snapshot with the files inside the delta directory
-parentDirPath = parentDirPath.getParent();
-  }
-  HdfsDirSnapshot dirSnapshot = addToSnapshot(dirToSnapshots, parentDirPath);
-  // We're not filtering out the metadata file and acid format file,
-  // as they represent parts of a valid snapshot
-  // We're not using the cached values downstream, but we can potentially optimize more in a follow-up task
-  if (fStatus.getPath().toString().contains(MetaDataFile.METADATA_FILE)) {
-dirSnapshot.addMetadataFile(fStatus);
-  } else if (fStatus.getPath().toString().contains(OrcAcidVersion.ACID_FORMAT)) {
-dirSnapshot.addOrcAcidFormatFile(fStatus);
-  } else {
-dirSnapshot.addFile(fStatus);
+Deque<RemoteIterator<LocatedFileStatus>> stack = new ArrayDeque<>();
+stack.push(FileUtils.listLocatedStatusIterator(fs, path, acidHiddenFileFilter));
+while (!stack.isEmpty()) {
+  RemoteIterator<LocatedFileStatus> itr = stack.pop();
+  while (itr.hasNext()) {
+FileStatus fStatus = itr.next();
+Path fPath = fStatus.getPath();
+if (fStatus.isDirectory()) {
+  stack.push(FileUtils.listLocatedStatusIterator(fs, fPath, acidHiddenFileFilter));

Review Comment:
   Here it will not list empty directories; actually, the above if condition was 
already obsolete in the old code. Tested with both the old and the modified code, 
and neither adds empty directories.
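   A minimal sketch of the FNFE-tolerant traversal under discussion, assuming a directory deleted mid-listing should simply drop out of the snapshot (the real patch's structure may differ; FileUtils.listLocatedStatusIterator is the helper the diff itself introduces):
   ```java
   while (!stack.isEmpty()) {
     RemoteIterator<LocatedFileStatus> itr = stack.pop();
     try {
       while (itr.hasNext()) {
         LocatedFileStatus fStatus = itr.next();
         if (fStatus.isDirectory()) {
           stack.push(FileUtils.listLocatedStatusIterator(fs, fStatus.getPath(), acidHiddenFileFilter));
         }
         // ... snapshot bookkeeping as in the diff above ...
       }
     } catch (FileNotFoundException e) {
       // The directory (e.g. a .hive-staging dir) was removed concurrently:
       // skip it instead of failing the whole snapshot.
     }
   }
   ```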





Issue Time Tracking
---

Worklog Id: (was: 853745)
Time Spent: 6.5h  (was: 6h 20m)

> AcidUtils#getHdfsDirSnapshots() throws FNFE when a directory is removed in 
> HDFS
> ---
>
> Key: HIVE-27135
> URL: https://issues.apache.org/jira/browse/HIVE-27135
> Project: Hive
>  Issue Type: Bug
>Reporter: Dayakar M
>Assignee: Dayakar M
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 6.5h
>  Remaining Estimate: 0h
>
> AcidUtils#getHdfsDirSnapshots() throws FileNotFoundException when a directory 
> is removed in HDFS while fetching HDFS Snapshots.
> Below testcode can be used to reproduce this issue.
> {code:java}
>  @Test
>   public void 
> testShouldNotThrowFNFEWhenHiveStagingDirectoryIsRemovedWhileFetchingHDFSSnapshots()
>  throws Exception {
> MockFileSystem fs = new MockFileSystem(new HiveConf(),
> new MockFile("mock:/tbl/part1/.hive-staging_dir/-ext-10002", 500, new 
> byte[0]),
> new MockFile("mock:/tbl/part2/.hive-staging_dir", 500, new byte[0]),
> new MockFile("mock:/tbl/part1/_tmp_space.db", 500, new byte[0]),
> new MockFile("mock:/tbl/part1/delta_1_1/bucket--", 500, new 
> byte[0]));
> Path path = new MockPath(fs, "/tbl");
> Path stageDir = new MockPath(fs, "mock:/tbl/part1/.hive-staging_dir");
> FileSystem mockFs = spy(fs);
> Mockito.doThrow(new 
> FileNotFoundException("")).when(mockFs).listLocatedStatus(eq(stageDir));
> try {
>   Map<Path, AcidUtils.HdfsDirSnapshot> hdfsDirSnapshots = 
> AcidUtils.getHdfsDirSnapshots(mockFs, path);
>   Assert.assertEquals(1, hdfsDirSnapshots.size());
> }
> catch (FileNotFoundException fnf) {
>   fail("Should not throw FileNotFoundException when a 

[jira] [Commented] (HIVE-27193) Database names starting with '@' cause error during ALTER/DROP table.

2023-03-29 Thread Oliver Schiller (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17706513#comment-17706513
 ] 

Oliver Schiller commented on HIVE-27193:


After poking around in the code for a while and trying to understand how everything 
plays together, I'm still uncertain where the catalog name should be prepended. 
I don't know whether the following assumptions/invariants are correct:
 * If the request object does not have its own catalog name field, the catalog 
name is prepended and sent to the metastore via the dbname. This always happens 
in the (Session)HiveMetaStoreClient class.
 * If a complete request object is passed to HiveMetaStoreClient, which is then 
just passed along, the caller ensures that the catalog is prepended.

If these are correct, going through HiveMetaStoreClient following these 
assumptions should ensure that a leading '@' does not cause issues, since a 
catalog is always prepended (however, if the catalog name is allowed to contain 
'#', another problem would emerge).
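For reference, a sketch of the encoding at issue, assuming the catalog-marker convention used by MetaStoreUtils.prependCatalogToDbName/parseDbName ('@' prefix, '#' separator); the helper below is an illustrative re-implementation, not Hive's code:

{code:java}
// "@hive#sales" encodes catalog "hive", database "sales". A database
// literally named "@test" is indistinguishable from a catalog-prefixed name
// that is missing its '#' separator, which is exactly what parseDbName
// complains about in the stack trace below.
static String prependCatalogToDbName(String catName, String dbName) {
  return "@" + catName + "#" + dbName;
}
{code}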

> Database names starting with '@' cause error during ALTER/DROP table.
> -
>
> Key: HIVE-27193
> URL: https://issues.apache.org/jira/browse/HIVE-27193
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore, Standalone Metastore
>Affects Versions: 4.0.0-alpha-2
>Reporter: Oliver Schiller
>Priority: Major
>
> The creation of databases whose names start with '@' is supported:
>  
> {code:java}
> create database `@test`;{code}
>  
> The creation of a table in this database works:
>  
> {code:java}
> create table `@test`.testtable (c1 integer);{code}
> However, dropping or altering the table result in an error:
>  
> {code:java}
> drop table `@test`.testtable;
> FAILED: SemanticException Unable to fetch table testtable. @test is prepended 
> with the catalog marker but does not appear to have a catalog name in it
> Error: Error while compiling statement: FAILED: SemanticException Unable to 
> fetch table testtable. @test is prepended with the catalog marker but does 
> not appear to have a catalog name in it (state=42000,code=4)
> alter table `@test`.testtable add columns (c2 integer);
> FAILED: SemanticException Unable to fetch table testtable. @test is prepended 
> with the catalog marker but does not appear to have a catalog name in it
> Error: Error while compiling statement: FAILED: SemanticException Unable to 
> fetch table testtable. @test is prepended with the catalog marker but does 
> not appear to have a catalog name in it (state=42000,code=4)
> {code}
>  
> Relevant snippet of stack trace:
>  
> {code:java}
> org.apache.hadoop.hive.metastore.api.MetaException: @TEST is prepended with 
> the catalog marker but does not appear to have a catalog name in it
> at 
> org.apache.hadoop.hive.metastore.utils.MetaStoreUtils.parseDbName(MetaStoreUtils.java:1031)
> at 
> org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.getTempTable(SessionHiveMetaStoreClient.java:651)
> at 
> org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.getTable(SessionHiveMetaStoreClient.java:279)
> at 
> org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.getTable(SessionHiveMetaStoreClient.java:273)
> at 
> org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.getTable(SessionHiveMetaStoreClient.java:258)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropTable(HiveMetaStoreClient.java:1982)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropTable(HiveMetaStoreClient.java:1957)
> ...{code}
>  
> My suspicion is that this is caused by the implementation of getTempTable and 
> how it is called. The method getTempTable calls parseDbName, assuming that the 
> given dbname might be prefixed with a catalog name. I'm wondering whether 
> this is correct at this layer. From poking around a bit, it appears to me 
> that the catalog name is typically prepended when making the actual thrift 
> call.
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26900) Error message not representing the correct line number with a syntax error in a HQL File

2023-03-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26900?focusedWorklogId=853744&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853744
 ]

ASF GitHub Bot logged work on HIVE-26900:
-

Author: ASF GitHub Bot
Created on: 29/Mar/23 17:00
Start Date: 29/Mar/23 17:00
Worklog Time Spent: 10m 
  Work Description: shreeyasand opened a new pull request, #4171:
URL: https://github.com/apache/hive/pull/4171

   …th a syntax error in a HQL File
   
   
   
   ### What changes were proposed in this pull request?
   
   In the Beeline class: 
   - A new method executeReader() has been introduced specifically to read hql 
files (a sketch follows after this list). It builds one string out of all the 
contents of the hql file, separated by newline characters (comments are excluded).
   
   In the Commands class:
   - Since handling multiple lines of query for hql files has already been 
addressed in the executeReader method, we limit the handleMultipleLineCmd() 
method to every other scenario besides when reading an hql file.
   
   In both Beeline.java and Commands.java:
   - Trimming of the string/sql is skipped while reading hql file contents; 
trimming is applied only when getOpts().getScriptFile() is null (i.e., in every 
situation except when reading an hql file). This is done so that whitespace and 
empty lines are not ignored while counting the line numbers.
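   
   A minimal sketch of the reading approach described in the first bullet 
(executeReader() is the PR's method name; this body is an illustration of the 
described behaviour, not the actual patch):
   {code:java}
   import java.io.BufferedReader;
   import java.io.FileReader;
   import java.io.IOException;
   
   public class HqlScriptReader {
     // Join all lines of the HQL file with '\n', blanking out full-line
     // comments instead of dropping them, so the parser's reported error
     // positions still map to the original line numbers.
     public static String readScript(String scriptFile) throws IOException {
       StringBuilder script = new StringBuilder();
       try (BufferedReader reader = new BufferedReader(new FileReader(scriptFile))) {
         String line;
         while ((line = reader.readLine()) != null) {
           script.append(line.trim().startsWith("--") ? "" : line).append('\n');
         }
       }
       return script.toString();
     }
   }
   {code}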
   
   ### Why are the changes needed?
   
   - Hive Cli reports the error line number correctly when reading HQL files, but 
Beeline does not. These changes are needed so that the error line number is 
reported correctly and there is no discrepancy between the functioning of Beeline 
and Hive Cli. 
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   - Error message in Beeline was not representing the correct line number 
prior to the changes. Now Beeline prints the correct error line number.
   
   
   
   ### How was this patch tested?
   
   - The testing was done locally on Beeline with multiple scenarios. The tests 
were verified against the correctly functioning Hive Cli.
   - As an example, for the given hql file:
   https://user-images.githubusercontent.com/50237152/222977016-e8a72f33-2f47-4ad4-aeff-2afb6f4a3bc9.png
   Error message prior to the changes:
   https://user-images.githubusercontent.com/50237152/222977044-90f746ee-1958-4c6a-9627-c1c1e2a173cc.png
   Error message after the changes:
   https://user-images.githubusercontent.com/50237152/222977064-d19b6bb8-b2bc-4292-a24a-1a14d04ab3eb.png




Issue Time Tracking
---

Worklog Id: (was: 853744)
Time Spent: 1.5h  (was: 1h 20m)

> Error message not representing the correct line number with a syntax error in 
> a HQL File
> 
>
> Key: HIVE-26900
> URL: https://issues.apache.org/jira/browse/HIVE-26900
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.2, 4.0.0-alpha-1, 4.0.0-alpha-2
>Reporter: Vikram Ahuja
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> When wrong syntax is added in an HQL file, the error thrown by beeline while 
> running the HQL file has the wrong line number. The line number and 
> even the position are incorrect. It seems the parser is not considering spaces 
> and new lines and always throws the error on line number 1, irrespective of 
> what line the error is on in the HQL file.
>  
> For instance, consider the following test.hql file:
>  # --comment
>  # --comment
>  # SET hive.server2.logging.operation.enabled=true;
>  # SET hive.server2.logging.operation.level=VERBOSE;
>  # show tables;
>  #  
>  #  
>  #       CREATE TABLEE DUMMY;
>  
> when we call !run  test.hql in beeline or trigger ./beeline -u 
> jdbc:hive2://localhost:1 -f test.hql, the error thrown by beeline is
> >>> CREATE TABLEE DUMMY;
> Error: Error while compiling statement: FAILED: ParseException line 1:7 
> cannot recognize input near 'CREATE' 'TABLEE' 'DUMMY' in ddl statement 
> (state=42000,code=4)
> The parser seems to be taking all the lines from 1 and is ignoring spaces in 
> the line.
> The error line in the parse exception is shown as 1:7 but it should have been 
> 8:13.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27135) AcidUtils#getHdfsDirSnapshots() throws FNFE when a directory is removed in HDFS

2023-03-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27135?focusedWorklogId=853743&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853743
 ]

ASF GitHub Bot logged work on HIVE-27135:
-

Author: ASF GitHub Bot
Created on: 29/Mar/23 16:59
Start Date: 29/Mar/23 16:59
Worklog Time Spent: 10m 
  Work Description: mdayakar commented on code in PR #4114:
URL: https://github.com/apache/hive/pull/4114#discussion_r1152241772


##
common/src/java/org/apache/hadoop/hive/common/FileUtils.java:
##
@@ -1376,6 +1376,12 @@ public static RemoteIterator<FileStatus> listStatusIterator(FileSystem fs, Path
     status -> filter.accept(status.getPath()));
   }
 
+  public static RemoteIterator<LocatedFileStatus> listLocatedStatusIterator(FileSystem fs, Path path, PathFilter filter)

Review Comment:
   Changed the code to use the FileStatus object and the existing 
_org.apache.hadoop.hive.common.FileUtils#listStatusIterator()_ API, which is 
also used in the 
_org.apache.hadoop.hive.ql.io.AcidUtils#getHdfsDirSnapshotsForCleaner()_ API.
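
   For reference, the pattern being described, as a hedged sketch (only 
FileUtils#listStatusIterator and the filter idea come from this thread; the 
surrounding method is an assumption, not the actual patch):
   {code:java}
   import java.io.IOException;
   import java.util.ArrayList;
   import java.util.List;
   import org.apache.hadoop.fs.FileStatus;
   import org.apache.hadoop.fs.FileSystem;
   import org.apache.hadoop.fs.Path;
   import org.apache.hadoop.fs.PathFilter;
   import org.apache.hadoop.fs.RemoteIterator;
   import org.apache.hadoop.hive.common.FileUtils;
   
   public class FilteredWalk {
     // Apply the filter on the listing call itself, so rejected directories
     // (e.g. transient ".hive-staging*" dirs) are never descended into and
     // cannot trigger the FileNotFoundException race described here.
     static List<FileStatus> walk(FileSystem fs, Path dir, PathFilter filter) throws IOException {
       List<FileStatus> out = new ArrayList<>();
       RemoteIterator<FileStatus> itr = FileUtils.listStatusIterator(fs, dir, filter);
       while (itr.hasNext()) {
         FileStatus status = itr.next();
         if (status.isDirectory()) {
           out.addAll(walk(fs, status.getPath(), filter));
         } else {
           out.add(status);
         }
       }
       return out;
     }
   }
   {code}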





Issue Time Tracking
---

Worklog Id: (was: 853743)
Time Spent: 6h 20m  (was: 6h 10m)

> AcidUtils#getHdfsDirSnapshots() throws FNFE when a directory is removed in 
> HDFS
> ---
>
> Key: HIVE-27135
> URL: https://issues.apache.org/jira/browse/HIVE-27135
> Project: Hive
>  Issue Type: Bug
>Reporter: Dayakar M
>Assignee: Dayakar M
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 6h 20m
>  Remaining Estimate: 0h
>
> AcidUtils#getHdfsDirSnapshots() throws FileNotFoundException when a directory 
> is removed in HDFS while fetching HDFS Snapshots.
> Below test code can be used to reproduce this issue.
> {code:java}
>   @Test
>   public void testShouldNotThrowFNFEWhenHiveStagingDirectoryIsRemovedWhileFetchingHDFSSnapshots() throws Exception {
>     MockFileSystem fs = new MockFileSystem(new HiveConf(),
>         new MockFile("mock:/tbl/part1/.hive-staging_dir/-ext-10002", 500, new byte[0]),
>         new MockFile("mock:/tbl/part2/.hive-staging_dir", 500, new byte[0]),
>         new MockFile("mock:/tbl/part1/_tmp_space.db", 500, new byte[0]),
>         new MockFile("mock:/tbl/part1/delta_1_1/bucket--", 500, new byte[0]));
>     Path path = new MockPath(fs, "/tbl");
>     Path stageDir = new MockPath(fs, "mock:/tbl/part1/.hive-staging_dir");
>     FileSystem mockFs = spy(fs);
>     Mockito.doThrow(new FileNotFoundException("")).when(mockFs).listLocatedStatus(eq(stageDir));
>     try {
>       Map<Path, AcidUtils.HdfsDirSnapshot> hdfsDirSnapshots = AcidUtils.getHdfsDirSnapshots(mockFs, path);
>       Assert.assertEquals(1, hdfsDirSnapshots.size());
>     } catch (FileNotFoundException fnf) {
>       fail("Should not throw FileNotFoundException when a directory is removed while fetching HDFSSnapshots");
>     }
>   }{code}
> This issue got fixed as a part of HIVE-26481, but here it's not fixed 
> completely. 
> [Here|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java#L1541]
> the FileUtils.listFiles() API is used, which returns a RemoteIterator. 
> While iterating, when an entry is a directory and the listing is recursive, 
> it tries to list the files in that directory; but if that directory was 
> removed by another thread/task in the meantime, it throws FileNotFoundException. Here the 
> directory that got removed is the .staging directory, which needs to be 
> excluded by using the passed filter.
>  
> So here we can use the same logic written in the 
> _org.apache.hadoop.hive.ql.io.AcidUtils#getHdfsDirSnapshotsForCleaner()_ API 
> to avoid the FileNotFoundException.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26900) Error message not representing the correct line number with a syntax error in a HQL File

2023-03-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26900?focusedWorklogId=853741&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853741
 ]

ASF GitHub Bot logged work on HIVE-26900:
-

Author: ASF GitHub Bot
Created on: 29/Mar/23 16:58
Start Date: 29/Mar/23 16:58
Worklog Time Spent: 10m 
  Work Description: shreeyasand closed pull request #4168: HIVE-26900: 
Error message not representing the correct line number wi…
URL: https://github.com/apache/hive/pull/4168




Issue Time Tracking
---

Worklog Id: (was: 853741)
Time Spent: 1h 20m  (was: 1h 10m)

> Error message not representing the correct line number with a syntax error in 
> a HQL File
> 
>
> Key: HIVE-26900
> URL: https://issues.apache.org/jira/browse/HIVE-26900
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.2, 4.0.0-alpha-1, 4.0.0-alpha-2
>Reporter: Vikram Ahuja
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> When wrong syntax is added in an HQL file, the error thrown by beeline while 
> running the HQL file has the wrong line number. The line number and 
> even the position are incorrect. It seems the parser is not considering spaces 
> and new lines and always throws the error on line number 1, irrespective of 
> what line the error is on in the HQL file.
>  
> For instance, consider the following test.hql file:
>  # --comment
>  # --comment
>  # SET hive.server2.logging.operation.enabled=true;
>  # SET hive.server2.logging.operation.level=VERBOSE;
>  # show tables;
>  #  
>  #  
>  #       CREATE TABLEE DUMMY;
>  
> when we call !run  test.hql in beeline or trigger ./beeline -u 
> jdbc:hive2://localhost:1 -f test.hql, the error thrown by beeline is
> >>> CREATE TABLEE DUMMY;
> Error: Error while compiling statement: FAILED: ParseException line 1:7 
> cannot recognize input near 'CREATE' 'TABLEE' 'DUMMY' in ddl statement 
> (state=42000,code=4)
> The parser seems to be taking all the lines from 1 and is ignoring spaces in 
> the line.
> The error line in the parse exception is shown as 1:7 but it should have been 
> 8:13.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27135) AcidUtils#getHdfsDirSnapshots() throws FNFE when a directory is removed in HDFS

2023-03-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27135?focusedWorklogId=853739&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853739
 ]

ASF GitHub Bot logged work on HIVE-27135:
-

Author: ASF GitHub Bot
Created on: 29/Mar/23 16:57
Start Date: 29/Mar/23 16:57
Worklog Time Spent: 10m 
  Work Description: mdayakar commented on code in PR #4114:
URL: https://github.com/apache/hive/pull/4114#discussion_r1152239566


##
ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java:
##
@@ -1538,32 +1538,36 @@ private static HdfsDirSnapshot addToSnapshot(Map<Path, HdfsDirSnapshot> dirToSna
   public static Map<Path, HdfsDirSnapshot> getHdfsDirSnapshots(final FileSystem fs, final Path path)
       throws IOException {
     Map<Path, HdfsDirSnapshot> dirToSnapshots = new HashMap<>();
-    RemoteIterator<LocatedFileStatus> itr = FileUtils.listFiles(fs, path, true, acidHiddenFileFilter);
-    while (itr.hasNext()) {
-      FileStatus fStatus = itr.next();
-      Path fPath = fStatus.getPath();
-      if (fStatus.isDirectory() && acidTempDirFilter.accept(fPath)) {
-        addToSnapshot(dirToSnapshots, fPath);
-      } else {
-        Path parentDirPath = fPath.getParent();
-        if (acidTempDirFilter.accept(parentDirPath)) {
-          while (isChildOfDelta(parentDirPath, path)) {
-            // Some cases there are other directory layers between the delta and the datafiles
-            // (export-import mm table, insert with union all to mm table, skewed tables).
-            // But it does not matter for the AcidState, we just need the deltas and the data files
-            // So build the snapshot with the files inside the delta directory
-            parentDirPath = parentDirPath.getParent();
-          }
-          HdfsDirSnapshot dirSnapshot = addToSnapshot(dirToSnapshots, parentDirPath);
-          // We're not filtering out the metadata file and acid format file,
-          // as they represent parts of a valid snapshot
-          // We're not using the cached values downstream, but we can potentially optimize more in a follow-up task
-          if (fStatus.getPath().toString().contains(MetaDataFile.METADATA_FILE)) {
-            dirSnapshot.addMetadataFile(fStatus);
-          } else if (fStatus.getPath().toString().contains(OrcAcidVersion.ACID_FORMAT)) {
-            dirSnapshot.addOrcAcidFormatFile(fStatus);
-          } else {
-            dirSnapshot.addFile(fStatus);
+    Deque<RemoteIterator<LocatedFileStatus>> stack = new ArrayDeque<>();

Review Comment:
   Updated accordingly.





Issue Time Tracking
---

Worklog Id: (was: 853739)
Time Spent: 6h 10m  (was: 6h)

> AcidUtils#getHdfsDirSnapshots() throws FNFE when a directory is removed in 
> HDFS
> ---
>
> Key: HIVE-27135
> URL: https://issues.apache.org/jira/browse/HIVE-27135
> Project: Hive
>  Issue Type: Bug
>Reporter: Dayakar M
>Assignee: Dayakar M
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 6h 10m
>  Remaining Estimate: 0h
>
> AcidUtils#getHdfsDirSnapshots() throws FileNotFoundException when a directory 
> is removed in HDFS while fetching HDFS Snapshots.
> Below test code can be used to reproduce this issue.
> {code:java}
>   @Test
>   public void testShouldNotThrowFNFEWhenHiveStagingDirectoryIsRemovedWhileFetchingHDFSSnapshots() throws Exception {
>     MockFileSystem fs = new MockFileSystem(new HiveConf(),
>         new MockFile("mock:/tbl/part1/.hive-staging_dir/-ext-10002", 500, new byte[0]),
>         new MockFile("mock:/tbl/part2/.hive-staging_dir", 500, new byte[0]),
>         new MockFile("mock:/tbl/part1/_tmp_space.db", 500, new byte[0]),
>         new MockFile("mock:/tbl/part1/delta_1_1/bucket--", 500, new byte[0]));
>     Path path = new MockPath(fs, "/tbl");
>     Path stageDir = new MockPath(fs, "mock:/tbl/part1/.hive-staging_dir");
>     FileSystem mockFs = spy(fs);
>     Mockito.doThrow(new FileNotFoundException("")).when(mockFs).listLocatedStatus(eq(stageDir));
>     try {
>       Map<Path, AcidUtils.HdfsDirSnapshot> hdfsDirSnapshots = AcidUtils.getHdfsDirSnapshots(mockFs, path);
>       Assert.assertEquals(1, hdfsDirSnapshots.size());
>     } catch (FileNotFoundException fnf) {
>       fail("Should not throw FileNotFoundException when a directory is removed while fetching HDFSSnapshots");
>     }
>   }{code}
> This issue got fixed as a part of HIVE-26481, but here it's not fixed 
> completely. 
> [Here|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java#L1541]
> the FileUtils.listFiles() API is used, which returns a RemoteIterator. 
> While iterating, when an entry is a directory and the listing is recursive, 
> it tries to list the files in that directory, but if that directory is 
> removed by another thread/task then it throws 

[jira] [Work logged] (HIVE-26400) Provide docker images for Hive

2023-03-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26400?focusedWorklogId=853736&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853736
 ]

ASF GitHub Bot logged work on HIVE-26400:
-

Author: ASF GitHub Bot
Created on: 29/Mar/23 16:41
Start Date: 29/Mar/23 16:41
Worklog Time Spent: 10m 
  Work Description: TuroczyX commented on PR #3448:
URL: https://github.com/apache/hive/pull/3448#issuecomment-1488941527

   Seems like the build is broken. @deniskuzZ Could you please re-start?




Issue Time Tracking
---

Worklog Id: (was: 853736)
Time Spent: 11h 40m  (was: 11.5h)

> Provide docker images for Hive
> --
>
> Key: HIVE-26400
> URL: https://issues.apache.org/jira/browse/HIVE-26400
> Project: Hive
>  Issue Type: Sub-task
>  Components: Build Infrastructure
>Reporter: Zhihua Deng
>Assignee: Zhihua Deng
>Priority: Blocker
>  Labels: hive-4.0.0-must, pull-request-available
>  Time Spent: 11h 40m
>  Remaining Estimate: 0h
>
> Make Apache Hive be able to run inside docker container in pseudo-distributed 
> mode, with MySQL/Derby as its back database, provide the following:
>  * Quick-start/Debugging/Prepare a test env for Hive;
>  * Tools to build target image with specified version of Hive and its 
> dependencies;
>  * Images can be used as the basis for the Kubernetes operator.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26400) Provide docker images for Hive

2023-03-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26400?focusedWorklogId=853735&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853735
 ]

ASF GitHub Bot logged work on HIVE-26400:
-

Author: ASF GitHub Bot
Created on: 29/Mar/23 16:37
Start Date: 29/Mar/23 16:37
Worklog Time Spent: 10m 
  Work Description: TuroczyX commented on PR #3448:
URL: https://github.com/apache/hive/pull/3448#issuecomment-1488936926

   > > > Should this initiative also include creating a docker image for the hive metastore standalone?
   > > > Something like this: 
https://techjogging.com/standalone-hive-metastore-presto-docker.html
   > > > 
https://github.com/arempter/hive-metastore-docker/blob/master/Dockerfile
   > > > 
https://github.com/aws-samples/hive-emr-on-eks/blob/main/docker/Dockerfile
   > > > Thanks
   > > 
   > > 
   > > It is a good point. @dengzhhu653 @deniskuzZ @ayushtkn @abstractdog What 
do you think about it?
   > 
   > The image can serve both HS2 and Metastore, as you can see in the README: 
https://github.com/apache/hive/pull/3448/files#diff-75345b4702a737ff955983bea3daeac9243e26ef1d2dc0398a31ef28380da9cb.
 Separating them needs another build, which makes it a bit hard to maintain in the 
public repo.
   
   Understandable. 




Issue Time Tracking
---

Worklog Id: (was: 853735)
Time Spent: 11.5h  (was: 11h 20m)

> Provide docker images for Hive
> --
>
> Key: HIVE-26400
> URL: https://issues.apache.org/jira/browse/HIVE-26400
> Project: Hive
>  Issue Type: Sub-task
>  Components: Build Infrastructure
>Reporter: Zhihua Deng
>Assignee: Zhihua Deng
>Priority: Blocker
>  Labels: hive-4.0.0-must, pull-request-available
>  Time Spent: 11.5h
>  Remaining Estimate: 0h
>
> Make Apache Hive be able to run inside docker container in pseudo-distributed 
> mode, with MySQL/Derby as its back database, provide the following:
>  * Quick-start/Debugging/Prepare a test env for Hive;
>  * Tools to build target image with specified version of Hive and its 
> dependencies;
>  * Images can be used as the basis for the Kubernetes operator.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HIVE-27193) Database names starting with '@' cause error during ALTER/DROP table.

2023-03-29 Thread Oliver Schiller (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17706501#comment-17706501
 ] 

Oliver Schiller commented on HIVE-27193:


The code in getTempTable was added in 
[https://github.com/apache/hive/pull/3072]. It seems that it was added to deal 
with changes made in getPartitionsByNames.

> Database names starting with '@' cause error during ALTER/DROP table.
> -
>
> Key: HIVE-27193
> URL: https://issues.apache.org/jira/browse/HIVE-27193
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore, Standalone Metastore
>Affects Versions: 4.0.0-alpha-2
>Reporter: Oliver Schiller
>Priority: Major
>
> The creation of database that start with '@' is supported:
>  
> {code:java}
> create database `@test`;{code}
>  
> The creation of a table in this database works:
>  
> {code:java}
> create table `@test`.testtable (c1 integer);{code}
> However, dropping or altering the table result in an error:
>  
> {code:java}
> drop table `@test`.testtable;
> FAILED: SemanticException Unable to fetch table testtable. @test is prepended 
> with the catalog marker but does not appear to have a catalog name in it
> Error: Error while compiling statement: FAILED: SemanticException Unable to 
> fetch table testtable. @test is prepended with the catalog marker but does 
> not appear to have a catalog name in it (state=42000,code=4)
> alter table `@test`.testtable add columns (c2 integer);
> FAILED: SemanticException Unable to fetch table testtable. @test is prepended 
> with the catalog marker but does not appear to have a catalog name in it
> Error: Error while compiling statement: FAILED: SemanticException Unable to 
> fetch table testtable. @test is prepended with the catalog marker but does 
> not appear to have a catalog name in it (state=42000,code=4)
> {code}
>  
> Relevant snippet of stack trace:
>  
> {code:java}
> org.apache.hadoop.hive.metastore.api.MetaException: @TEST is prepended with the catalog marker but does not appear to have a catalog name in it
> at org.apache.hadoop.hive.metastore.utils.MetaStoreUtils.parseDbName(MetaStoreUtils.java:1031)
> at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.getTempTable(SessionHiveMetaStoreClient.java:651)
> at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.getTable(SessionHiveMetaStoreClient.java:279)
> at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.getTable(SessionHiveMetaStoreClient.java:273)
> at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.getTable(SessionHiveMetaStoreClient.java:258)
> at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropTable(HiveMetaStoreClient.java:1982)
> at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropTable(HiveMetaStoreClient.java:1957)
> ...{code}
>  
> My suspicion is that this is caused by the implementation of getTempTable and 
> how it is called. The method getTempTable calls parseDbName, assuming that the 
> given dbname might be prefixed with a catalog name. I'm wondering whether 
> this is correct at this layer. From poking around a bit, it appears to me 
> that the catalog name is typically prepended when making the actual thrift 
> call.
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26809) Upgrade ORC to 1.8.3

2023-03-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26809?focusedWorklogId=853704&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853704
 ]

ASF GitHub Bot logged work on HIVE-26809:
-

Author: ASF GitHub Bot
Created on: 29/Mar/23 14:20
Start Date: 29/Mar/23 14:20
Worklog Time Spent: 10m 
  Work Description: sonarcloud[bot] commented on PR #4121:
URL: https://github.com/apache/hive/pull/4121#issuecomment-1488718775

   Kudos, SonarCloud Quality Gate passed! https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=4121
   
   0 Bugs, 0 Vulnerabilities, 0 Security Hotspots, 2 Code Smells.
   No Coverage information. No Duplication information.
   
   




Issue Time Tracking
---

Worklog Id: (was: 853704)
Time Spent: 8h 40m  (was: 8.5h)

> Upgrade ORC to 1.8.3
> 
>
> Key: HIVE-26809
> URL: https://issues.apache.org/jira/browse/HIVE-26809
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 4.0.0
>Reporter: Dmitriy Fingerman
>Assignee: Zoltán Rátkai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 8h 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27192) Use normal import instead of shaded import in TestSchemaToolCatalogOps.java

2023-03-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27192?focusedWorklogId=853701&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853701
 ]

ASF GitHub Bot logged work on HIVE-27192:
-

Author: ASF GitHub Bot
Created on: 29/Mar/23 14:13
Start Date: 29/Mar/23 14:13
Worklog Time Spent: 10m 
  Work Description: sonarcloud[bot] commented on PR #4169:
URL: https://github.com/apache/hive/pull/4169#issuecomment-1488706465

   Kudos, SonarCloud Quality Gate passed! https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=4169
   
   0 Bugs, 0 Vulnerabilities, 0 Security Hotspots, 0 Code Smells.
   No Coverage information. No Duplication information.
   
   




Issue Time Tracking
---

Worklog Id: (was: 853701)
Time Spent: 20m  (was: 10m)

> Use normal import instead of shaded import in TestSchemaToolCatalogOps.java
> ---
>
> Key: HIVE-27192
> URL: https://issues.apache.org/jira/browse/HIVE-27192
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltán Rátkai
>Assignee: Zoltán Rátkai
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27192) Use normal import instead of shaded import in TestSchemaToolCatalogOps.java

2023-03-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-27192:
--
Labels: pull-request-available  (was: )

> Use normal import instead of shaded import in TestSchemaToolCatalogOps.java
> ---
>
> Key: HIVE-27192
> URL: https://issues.apache.org/jira/browse/HIVE-27192
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltán Rátkai
>Assignee: Zoltán Rátkai
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27192) Use normal import instead of shaded import in TestSchemaToolCatalogOps.java

2023-03-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27192?focusedWorklogId=853680&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853680
 ]

ASF GitHub Bot logged work on HIVE-27192:
-

Author: ASF GitHub Bot
Created on: 29/Mar/23 13:10
Start Date: 29/Mar/23 13:10
Worklog Time Spent: 10m 
  Work Description: zratkai opened a new pull request, #4169:
URL: https://github.com/apache/hive/pull/4169

   …olCatalogOps.java
   
   
   
   ### What changes were proposed in this pull request?
   
   Eliminated the shaded usage of com.google.common.io.Files;
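   
   For illustration (hedged: the relocated package prefix below is an 
assumption; only the unshaded com.google.common.io.Files import is named in 
the PR):
   {code:java}
   // before: a shaded/relocated copy of Guava (prefix assumed)
   // import org.apache.hive.com.google.common.io.Files;
   // after: the normal Guava import
   import com.google.common.io.Files;
   {code}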
   
   
   ### Why are the changes needed?
   We do not need shaded usage here. 
   
   
   ### Does this PR introduce _any_ user-facing change?
   No.
   
   
   ### How was this patch tested?
   Jenkins test.
   




Issue Time Tracking
---

Worklog Id: (was: 853680)
Remaining Estimate: 0h
Time Spent: 10m

> Use normal import instead of shaded import in TestSchemaToolCatalogOps.java
> ---
>
> Key: HIVE-27192
> URL: https://issues.apache.org/jira/browse/HIVE-27192
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltán Rátkai
>Assignee: Zoltán Rátkai
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26900) Error message not representing the correct line number with a syntax error in a HQL File

2023-03-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26900?focusedWorklogId=853678&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853678
 ]

ASF GitHub Bot logged work on HIVE-26900:
-

Author: ASF GitHub Bot
Created on: 29/Mar/23 13:01
Start Date: 29/Mar/23 13:01
Worklog Time Spent: 10m 
  Work Description: shreeyasand opened a new pull request, #4168:
URL: https://github.com/apache/hive/pull/4168

   …th a syntax error in a HQL File
   
   
   
   ### What changes were proposed in this pull request?
   
   In the Beeline class: 
   - A new method executeReader() has been introduced specifically to read hql 
files. It makes one string out of all the contents of the hql file separated by 
newline characters (the comments are excluded).
   
   In the Commands class:
   - Since handling multiple lines of query for hql files has already been 
addressed in the executeReader method, we limit the handleMultipleLineCmd() 
method to every other scenario besides when reading an hql file.
   
   In both Beeline.java and Commands.java:
   Trimming of the string/sql is skipped while reading hql file contents; 
trimming is applied only when getOpts().getScriptFile() is null (i.e., in every 
situation except when reading an hql file). This is done so that whitespace and 
empty lines are not ignored while counting the line numbers.
   ### Why are the changes needed?
   
   Hive Cli reports the error line number correctly when reading HQL files, but 
Beeline does not. These changes are needed so that the error line number is 
reported correctly and there is no discrepancy between the functioning of Beeline 
and Hive Cli. 
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   Error message in Beeline was not representing the correct line number prior 
to the changes. Now Beeline prints the correct error line number.
   
   
   
   ### How was this patch tested?
   
   The testing was done locally on Beeline with multiple scenarios. The tests 
were verified against the correctly functioning Hive Cli.
   As an example, for the given hql file:
   https://user-images.githubusercontent.com/50237152/222977016-e8a72f33-2f47-4ad4-aeff-2afb6f4a3bc9.png
   Error message prior to the changes:
   https://user-images.githubusercontent.com/50237152/222977044-90f746ee-1958-4c6a-9627-c1c1e2a173cc.png
   Error message after the changes:
   https://user-images.githubusercontent.com/50237152/222977064-d19b6bb8-b2bc-4292-a24a-1a14d04ab3eb.png




Issue Time Tracking
---

Worklog Id: (was: 853678)
Time Spent: 1h 10m  (was: 1h)

> Error message not representing the correct line number with a syntax error in 
> a HQL File
> 
>
> Key: HIVE-26900
> URL: https://issues.apache.org/jira/browse/HIVE-26900
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.2, 4.0.0-alpha-1, 4.0.0-alpha-2
>Reporter: Vikram Ahuja
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> When wrong syntax is added in an HQL file, the error thrown by beeline while 
> running the HQL file has the wrong line number. The line number and 
> even the position are incorrect. It seems the parser is not considering spaces 
> and new lines and always throws the error on line number 1, irrespective of 
> what line the error is on in the HQL file.
>  
> For instance, consider the following test.hql file:
>  # --comment
>  # --comment
>  # SET hive.server2.logging.operation.enabled=true;
>  # SET hive.server2.logging.operation.level=VERBOSE;
>  # show tables;
>  #  
>  #  
>  #       CREATE TABLEE DUMMY;
>  
> when we call !run  test.hql in beeline or trigger ./beeline -u 
> jdbc:hive2://localhost:1 -f test.hql, the error thrown by beeline is
> >>> CREATE TABLEE DUMMY;
> Error: Error while compiling statement: FAILED: ParseException line 1:7 
> cannot recognize input near 'CREATE' 'TABLEE' 'DUMMY' in ddl statement 
> (state=42000,code=4)
> The parser seems to be taking all the lines from 1 and is ignoring spaces in 
> the line.
> The error line in the parse exception is shown as 1:7 but it should have been 
> 8:13.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HIVE-27192) Use normal import instead of shaded import in TestSchemaToolCatalogOps.java

2023-03-29 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-27192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltán Rátkai reassigned HIVE-27192:



> Use normal import instead of shaded import in TestSchemaToolCatalogOps.java
> ---
>
> Key: HIVE-27192
> URL: https://issues.apache.org/jira/browse/HIVE-27192
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltán Rátkai
>Assignee: Zoltán Rátkai
>Priority: Minor
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26900) Error message not representing the correct line number with a syntax error in a HQL File

2023-03-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26900?focusedWorklogId=853676&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853676
 ]

ASF GitHub Bot logged work on HIVE-26900:
-

Author: ASF GitHub Bot
Created on: 29/Mar/23 12:40
Start Date: 29/Mar/23 12:40
Worklog Time Spent: 10m 
  Work Description: shreeyasand closed pull request #4097: HIVE-26900: 
Error message not representing the correct line number wi…
URL: https://github.com/apache/hive/pull/4097




Issue Time Tracking
---

Worklog Id: (was: 853676)
Time Spent: 1h  (was: 50m)

> Error message not representing the correct line number with a syntax error in 
> a HQL File
> 
>
> Key: HIVE-26900
> URL: https://issues.apache.org/jira/browse/HIVE-26900
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.2, 4.0.0-alpha-1, 4.0.0-alpha-2
>Reporter: Vikram Ahuja
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> When wrong syntax is added in an HQL file, the error thrown by beeline while 
> running the HQL file has the wrong line number. The line number and 
> even the position are incorrect. It seems the parser is not considering spaces 
> and new lines and always throws the error on line number 1, irrespective of 
> what line the error is on in the HQL file.
>  
> For instance, consider the following test.hql file:
>  # --comment
>  # --comment
>  # SET hive.server2.logging.operation.enabled=true;
>  # SET hive.server2.logging.operation.level=VERBOSE;
>  # show tables;
>  #  
>  #  
>  #       CREATE TABLEE DUMMY;
>  
> when we call !run  test.hql in beeline or trigger ./beeline -u 
> jdbc:hive2://localhost:1 -f test.hql, the error thrown by beeline is
> >>> CREATE TABLEE DUMMY;
> Error: Error while compiling statement: FAILED: ParseException line 1:7 
> cannot recognize input near 'CREATE' 'TABLEE' 'DUMMY' in ddl statement 
> (state=42000,code=4)
> The parser seems to be taking all the lines from 1 and is ignoring spaces in 
> the line.
> The error line in the parse exception is shown as 1:7 but it should have been 
> 8:13.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26900) Error message not representing the correct line number with a syntax error in a HQL File

2023-03-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26900?focusedWorklogId=853672&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853672
 ]

ASF GitHub Bot logged work on HIVE-26900:
-

Author: ASF GitHub Bot
Created on: 29/Mar/23 12:38
Start Date: 29/Mar/23 12:38
Worklog Time Spent: 10m 
  Work Description: shreeyasand opened a new pull request, #4097:
URL: https://github.com/apache/hive/pull/4097

   …th a syntax error in a HQL File
   
   ### What changes were proposed in this pull request?
   
   In the Beeline class: 
   - the execute method (at line 1362) has been modified to make one string out 
of all the contents of the hql file separated by newline characters (the 
comments are excluded).
   - if the final string is null, the code exits the while loop (it implies 
that there is no command to be executed).
   
   In both classes (Beeline and Commands), the trim() call has been removed 
from a few places. This is done so that whitespace and empty lines are not 
ignored while counting the line numbers.
   
   
   ### Why are the changes needed?
   
   Hive Cli reports the error line number correctly when reading HQL files, but 
Beeline does not. These changes are needed so that the error line number is 
reported correctly and there is no discrepancy between the functioning of Beeline 
and Hive Cli. 
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   Error message in Beeline was not representing the correct line number prior 
to the changes. Now Beeline prints the correct error line number.
   
   
   
   ### How was this patch tested?
   
   The testing was done locally on Beeline with multiple scenarios. The tests 
were verified against the correctly functioning Hive Cli.
   As an example, for the given hql file:
   https://user-images.githubusercontent.com/50237152/222977016-e8a72f33-2f47-4ad4-aeff-2afb6f4a3bc9.png
   Error message prior to the changes:
   https://user-images.githubusercontent.com/50237152/222977044-90f746ee-1958-4c6a-9627-c1c1e2a173cc.png
   Error message after the changes:
   https://user-images.githubusercontent.com/50237152/222977064-d19b6bb8-b2bc-4292-a24a-1a14d04ab3eb.png
   
   
   




Issue Time Tracking
---

Worklog Id: (was: 853672)
Time Spent: 50m  (was: 40m)

> Error message not representing the correct line number with a syntax error in 
> a HQL File
> 
>
> Key: HIVE-26900
> URL: https://issues.apache.org/jira/browse/HIVE-26900
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.2, 4.0.0-alpha-1, 4.0.0-alpha-2
>Reporter: Vikram Ahuja
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> When wrong syntax is added in an HQL file, the error thrown by beeline while 
> running the HQL file has the wrong line number. The line number and 
> even the position are incorrect. It seems the parser is not considering spaces 
> and new lines and always throws the error on line number 1, irrespective of 
> what line the error is on in the HQL file.
>  
> For instance, consider the following test.hql file:
>  # --comment
>  # --comment
>  # SET hive.server2.logging.operation.enabled=true;
>  # SET hive.server2.logging.operation.level=VERBOSE;
>  # show tables;
>  #  
>  #  
>  #       CREATE TABLEE DUMMY;
>  
> when we call !run  test.hql in beeline or trigger ./beeline -u 
> jdbc:hive2://localhost:1 -f test.hql, the error thrown by beeline is
> >>> CREATE TABLEE DUMMY;
> Error: Error while compiling statement: FAILED: ParseException line 1:7 
> cannot recognize input near 'CREATE' 'TABLEE' 'DUMMY' in ddl statement 
> (state=42000,code=4)
> The parser seems to be taking all the lines from 1 and is ignoring spaces in 
> the line.
> The error line in the parse exception is shown as 1:7 but it should have been 
> 8:13.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27187) Incremental rebuild of materialized view having aggregate and stored by iceberg

2023-03-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27187?focusedWorklogId=853666&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853666
 ]

ASF GitHub Bot logged work on HIVE-27187:
-

Author: ASF GitHub Bot
Created on: 29/Mar/23 12:01
Start Date: 29/Mar/23 12:01
Worklog Time Spent: 10m 
  Work Description: sonarcloud[bot] commented on PR #4166:
URL: https://github.com/apache/hive/pull/4166#issuecomment-1488471135

   Kudos, SonarCloud Quality Gate passed! https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=4166
   
   0 Bugs, 0 Vulnerabilities, 0 Security Hotspots, 4 Code Smells.
   No Coverage information. No Duplication information.
   
   




Issue Time Tracking
---

Worklog Id: (was: 853666)
Time Spent: 0.5h  (was: 20m)

> Incremental rebuild of materialized view having aggregate and stored by 
> iceberg
> ---
>
> Key: HIVE-27187
> URL: https://issues.apache.org/jira/browse/HIVE-27187
> Project: Hive
>  Issue Type: Improvement
>  Components: Iceberg integration, Materialized views
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Currently, the incremental rebuild of a materialized view stored by Iceberg whose 
> definition query contains an aggregate operator is transformed into an insert 
> overwrite statement which contains a union operator, if the source tables 
> contain insert operations only. One branch of the union scans the view, the 
> other produces the delta.
> This can be improved further: transform the statement into a multi insert 
> statement representing a merge statement, to insert new aggregations and 
> update existing ones.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27189) Remove duplicate info log in Hive.isSubDIr

2023-03-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27189?focusedWorklogId=853654&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853654
 ]

ASF GitHub Bot logged work on HIVE-27189:
-

Author: ASF GitHub Bot
Created on: 29/Mar/23 10:56
Start Date: 29/Mar/23 10:56
Worklog Time Spent: 10m 
  Work Description: sonarcloud[bot] commented on PR #4167:
URL: https://github.com/apache/hive/pull/4167#issuecomment-1488378213

   Kudos, SonarCloud Quality Gate passed! https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=4167
   
   0 Bugs, 0 Vulnerabilities, 0 Security Hotspots, 1 Code Smell.
   No Coverage information. No Duplication information.
   
   




Issue Time Tracking
---

Worklog Id: (was: 853654)
Time Spent: 20m  (was: 10m)

> Remove duplicate info log in Hive.isSubDIr
> --
>
> Key: HIVE-27189
> URL: https://issues.apache.org/jira/browse/HIVE-27189
> Project: Hive
>  Issue Type: Improvement
>Reporter: shuyouZZ
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> In class {{org.apache.hadoop.hive.ql.metadata.Hive}}, invoking the method 
> {{isSubDir}} will print the following line twice:
> {code:java}
> LOG.debug("The source path is " + fullF1 + " and the destination path is " + fullF2);{code}
> we should remove the duplicate log.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27189) Remove duplicate info log in Hive.isSubDIr

2023-03-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-27189:
--
Labels: pull-request-available  (was: )

> Remove duplicate info log in Hive.isSubDIr
> --
>
> Key: HIVE-27189
> URL: https://issues.apache.org/jira/browse/HIVE-27189
> Project: Hive
>  Issue Type: Improvement
>Reporter: shuyouZZ
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In class {{org.apache.hadoop.hive.ql.metadata.Hive}}, invoking the method 
> {{isSubDir}} will print the following line twice:
> {code:java}
> LOG.debug("The source path is " + fullF1 + " and the destination path is " + fullF2);{code}
> we should remove the duplicate log.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HIVE-27191) Cleaner is blocked by orphaned entries in MHL table

2023-03-29 Thread Simhadri Govindappa (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simhadri Govindappa reassigned HIVE-27191:
--


> Cleaner is blocked by orphaned entries in MHL table
> ---
>
> Key: HIVE-27191
> URL: https://issues.apache.org/jira/browse/HIVE-27191
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri Govindappa
>Assignee: Simhadri Govindappa
>Priority: Major
>
> The following mhl_txnids do not exist in the TXNS table; as a result, the 
> cleaner gets blocked and many entries are stuck in the ready-for-cleaning 
> state. 
> The cleaner should periodically check for such entries and remove them from 
> the MIN_HISTORY_LEVEL table to prevent the cleaner from being blocked (a 
> sketch follows the listing below).
> {noformat}
> postgres=# select mhl_txnid from min_history_level where not exists (select 1 
> from txns where txn_id = mhl_txnid);
>  mhl_txnid
> ---
>   43708080
>   43708088
>   43679962
>   43680464
>   43680352
>   43680392
>   43680424
>   43680436
>   43680471
>   43680475
>   43680483
>   43622677
>   43708083
>   43708084
>   43678157
>   43680482
>   43680484
>   43622745
>   43622750
>   43706829
>   43707261
> (21 rows){noformat}
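
A cleanup of the kind described could look like the following (a hedged sketch 
against the backend database; table and column names are taken from the query 
above, and the actual fix may well differ):
{code:sql}
-- Remove MIN_HISTORY_LEVEL entries whose transaction no longer exists in TXNS,
-- so they can no longer block the cleaner.
DELETE FROM min_history_level
WHERE NOT EXISTS (SELECT 1 FROM txns WHERE txn_id = mhl_txnid);
{code}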



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27189) Remove duplicate info log in Hive.isSubDIr

2023-03-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27189?focusedWorklogId=853644&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853644
 ]

ASF GitHub Bot logged work on HIVE-27189:
-

Author: ASF GitHub Bot
Created on: 29/Mar/23 10:10
Start Date: 29/Mar/23 10:10
Worklog Time Spent: 10m 
  Work Description: shuyouZZ opened a new pull request, #4167:
URL: https://github.com/apache/hive/pull/4167

   ### What changes were proposed in this pull request?
   Remove duplicate info log in Hive.isSubDIr
   
   
   ### Why are the changes needed?
   In class org.apache.hadoop.hive.ql.metadata.Hive, invoking the method isSubDir 
will print the following line twice:
   
   `LOG.debug("The source path is " + fullF1 + " and the destination path is " + fullF2);`
   
   we should remove the duplicate log (a sketch of the change follows the log example below).
   
   Below is an example from a log file:
   `23/03/27 05:11:08 INFO Hive: New loading path = 
hdfs://R2/projects/sellerdata_ods/hive/sellerdata_ods/ods_shopee_order_detail_v8_di_ab_test/.hive-staging_hive_2023-03-27_05-09-17_848_8941157515106120269-1/-ext-1/dt=20230327/country=VN
 with partSpec {dt=20230327, country=VN}
   23/03/27 05:11:08 DEBUG Hive: The source path is 
/projects/sellerdata_ods/hive/sellerdata_ods/ods_shopee_order_detail_v8_di_ab_test/.hive-staging_hive_2023-03-27_05-09-17_848_8941157515106120269-1/-ext-1/dt=20230327/country=SG/part-00179-51723c84-5be6-428e-862e-c4716e9536cc.c000/
 and the destination path is 
/projects/sellerdata_ods/hive/sellerdata_ods/ods_shopee_order_detail_v8_di_ab_test/dt=20230327/country=SG/part-00179-51723c84-5be6-428e-862e-c4716e9536cc.c000/
   23/03/27 05:11:08 DEBUG Hive: The source path is 
/projects/sellerdata_ods/hive/sellerdata_ods/ods_shopee_order_detail_v8_di_ab_test/.hive-staging_hive_2023-03-27_05-09-17_848_8941157515106120269-1/-ext-1/dt=20230327/country=SG/part-00179-51723c84-5be6-428e-862e-c4716e9536cc.c000/
 and the destination path is 
/projects/sellerdata_ods/hive/sellerdata_ods/ods_shopee_order_detail_v8_di_ab_test/dt=20230327/country=SG/part-00179-51723c84-5be6-428e-862e-c4716e9536cc.c000/
   23/03/27 05:11:08 INFO Hive: Renaming src: 
hdfs://R2/projects/sellerdata_ods/hive/sellerdata_ods/ods_shopee_order_detail_v8_di_ab_test/.hive-staging_hive_2023-03-27_05-09-17_848_8941157515106120269-1/-ext-1/dt=20230327/country=SG/part-00179-51723c84-5be6-428e-862e-c4716e9536cc.c000,
 dest: 
hdfs://R2/projects/sellerdata_ods/hive/sellerdata_ods/ods_shopee_order_detail_v8_di_ab_test/dt=20230327/country=SG/part-00179-51723c84-5be6-428e-862e-c4716e9536cc.c000,
 Status:true
   23/03/27 05:11:09 DEBUG Hive: altering partition for table 
ods_shopee_order_detail_v8_di_ab_test with partition spec : {dt=20230327, 
country=SG}
   `
   
   
   ### Does this PR introduce _any_ user-facing change?
   NO
   
   
   ### How was this patch tested?
   No need
   




Issue Time Tracking
---

Worklog Id: (was: 853644)
Remaining Estimate: 0h
Time Spent: 10m

> Remove duplicate info log in Hive.isSubDir
> --
>
> Key: HIVE-27189
> URL: https://issues.apache.org/jira/browse/HIVE-27189
> Project: Hive
>  Issue Type: Improvement
>Reporter: shuyouZZ
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In class {{org.apache.hadoop.hive.ql.metadata.Hive}}, invoking the method 
> {{isSubDir}} prints the following line twice:
> {code:java}
> LOG.debug("The source path is " + fullF1 + " and the destination path is " + 
> fullF2);{code}
> We should remove the duplicate log statement.
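> A hedged sketch of the intended change (method body assumed, surrounding 
> logic elided; not the actual patch):
> {code:java}
> // In org.apache.hadoop.hive.ql.metadata.Hive:
> private static boolean isSubDir(Path srcf, Path destf, FileSystem fs) {
>   Path fullF1 = getQualifiedPathWithoutSchemeAndAuthority(srcf, fs);
>   Path fullF2 = getQualifiedPathWithoutSchemeAndAuthority(destf, fs);
>   // Log the pair once; the second, identical LOG.debug call is removed.
>   LOG.debug("The source path is " + fullF1 + " and the destination path is " + fullF2);
>   return fullF1.toString().startsWith(fullF2.toString() + Path.SEPARATOR);
> }
> {code}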



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HIVE-27190) Implement col stats cache for hive iceberg table

2023-03-29 Thread Simhadri Govindappa (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simhadri Govindappa reassigned HIVE-27190:
--

Assignee: Simhadri Govindappa

> Implement col stats cache for hive iceberg table
> -
>
> Key: HIVE-27190
> URL: https://issues.apache.org/jira/browse/HIVE-27190
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri Govindappa
>Assignee: Simhadri Govindappa
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26809) Upgrade ORC to 1.8.3

2023-03-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26809?focusedWorklogId=853643=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853643
 ]

ASF GitHub Bot logged work on HIVE-26809:
-

Author: ASF GitHub Bot
Created on: 29/Mar/23 10:03
Start Date: 29/Mar/23 10:03
Worklog Time Spent: 10m 
  Work Description: sonarcloud[bot] commented on PR #4121:
URL: https://github.com/apache/hive/pull/4121#issuecomment-1488304670

   Kudos, SonarCloud Quality Gate passed!
   
   Bugs: 0 (rating A) | Vulnerabilities: 0 (rating A) | Security Hotspots: 0 
(rating A) | Code Smells: 2 (rating A)
   No Coverage information | No Duplication information
   Full report: https://sonarcloud.io/dashboard?id=apache_hive=4121
   
   




Issue Time Tracking
---

Worklog Id: (was: 853643)
Time Spent: 8.5h  (was: 8h 20m)

> Upgrade ORC to 1.8.3
> 
>
> Key: HIVE-26809
> URL: https://issues.apache.org/jira/browse/HIVE-26809
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 4.0.0
>Reporter: Dmitriy Fingerman
>Assignee: Zoltán Rátkai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 8.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27135) AcidUtils#getHdfsDirSnapshots() throws FNFE when a directory is removed in HDFS

2023-03-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27135?focusedWorklogId=853642=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853642
 ]

ASF GitHub Bot logged work on HIVE-27135:
-

Author: ASF GitHub Bot
Created on: 29/Mar/23 10:00
Start Date: 29/Mar/23 10:00
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #4114:
URL: https://github.com/apache/hive/pull/4114#discussion_r1151692776


##
ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java:
##
@@ -1538,32 +1538,36 @@ private static HdfsDirSnapshot addToSnapshot(Map<Path, 
HdfsDirSnapshot> dirToSnapshots,
   public static Map<Path, HdfsDirSnapshot> getHdfsDirSnapshots(final 
FileSystem fs, final Path path)
   throws IOException {
 Map<Path, HdfsDirSnapshot> dirToSnapshots = new HashMap<>();
-RemoteIterator<LocatedFileStatus> itr = FileUtils.listFiles(fs, path, 
true, acidHiddenFileFilter);
-while (itr.hasNext()) {
-  FileStatus fStatus = itr.next();
-  Path fPath = fStatus.getPath();
-  if (fStatus.isDirectory() && acidTempDirFilter.accept(fPath)) {
-addToSnapshot(dirToSnapshots, fPath);
-  } else {
-Path parentDirPath = fPath.getParent();
-if (acidTempDirFilter.accept(parentDirPath)) {
-  while (isChildOfDelta(parentDirPath, path)) {
-// Some cases there are other directory layers between the delta 
and the datafiles
-// (export-import mm table, insert with union all to mm table, 
skewed tables).
-// But it does not matter for the AcidState, we just need the 
deltas and the data files
-// So build the snapshot with the files inside the delta directory
-parentDirPath = parentDirPath.getParent();
-  }
-  HdfsDirSnapshot dirSnapshot = addToSnapshot(dirToSnapshots, 
parentDirPath);
-  // We're not filtering out the metadata file and acid format file,
-  // as they represent parts of a valid snapshot
-  // We're not using the cached values downstream, but we can 
potentially optimize more in a follow-up task
-  if 
(fStatus.getPath().toString().contains(MetaDataFile.METADATA_FILE)) {
-dirSnapshot.addMetadataFile(fStatus);
-  } else if 
(fStatus.getPath().toString().contains(OrcAcidVersion.ACID_FORMAT)) {
-dirSnapshot.addOrcAcidFormatFile(fStatus);
-  } else {
-dirSnapshot.addFile(fStatus);
+Deque<RemoteIterator<LocatedFileStatus>> stack = new ArrayDeque<>();

Review Comment:
   please update the PR description





Issue Time Tracking
---

Worklog Id: (was: 853642)
Time Spent: 6h  (was: 5h 50m)

> AcidUtils#getHdfsDirSnapshots() throws FNFE when a directory is removed in 
> HDFS
> ---
>
> Key: HIVE-27135
> URL: https://issues.apache.org/jira/browse/HIVE-27135
> Project: Hive
>  Issue Type: Bug
>Reporter: Dayakar M
>Assignee: Dayakar M
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 6h
>  Remaining Estimate: 0h
>
> AcidUtils#getHdfsDirSnapshots() throws FileNotFoundException when a directory 
> is removed in HDFS while fetching HDFS Snapshots.
> The test code below can be used to reproduce this issue.
> {code:java}
>  @Test
>   public void 
> testShouldNotThrowFNFEWhenHiveStagingDirectoryIsRemovedWhileFetchingHDFSSnapshots()
>  throws Exception {
> MockFileSystem fs = new MockFileSystem(new HiveConf(),
> new MockFile("mock:/tbl/part1/.hive-staging_dir/-ext-10002", 500, new 
> byte[0]),
> new MockFile("mock:/tbl/part2/.hive-staging_dir", 500, new byte[0]),
> new MockFile("mock:/tbl/part1/_tmp_space.db", 500, new byte[0]),
> new MockFile("mock:/tbl/part1/delta_1_1/bucket--", 500, new 
> byte[0]));
> Path path = new MockPath(fs, "/tbl");
> Path stageDir = new MockPath(fs, "mock:/tbl/part1/.hive-staging_dir");
> FileSystem mockFs = spy(fs);
> Mockito.doThrow(new 
> FileNotFoundException("")).when(mockFs).listLocatedStatus(eq(stageDir));
> try {
>   Map<Path, HdfsDirSnapshot> hdfsDirSnapshots = 
> AcidUtils.getHdfsDirSnapshots(mockFs, path);
>   Assert.assertEquals(1, hdfsDirSnapshots.size());
> }
> catch (FileNotFoundException fnf) {
>   fail("Should not throw FileNotFoundException when a directory is 
> removed while fetching HDFSSnapshots");
> }
>   }{code}
> This issue was partially fixed as part of HIVE-26481, but it is not fixed 
> completely. 
> [Here|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java#L1541]
>  the FileUtils.listFiles() API is used, which returns a RemoteIterator. 
> While iterating, it checks whether an entry is a directory and, when listing 
> recursively, tries to list the files of that directory; if that directory is 
> removed by another thread/task in the meantime, it throws 

[jira] [Work logged] (HIVE-27135) AcidUtils#getHdfsDirSnapshots() throws FNFE when a directory is removed in HDFS

2023-03-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27135?focusedWorklogId=853641=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853641
 ]

ASF GitHub Bot logged work on HIVE-27135:
-

Author: ASF GitHub Bot
Created on: 29/Mar/23 09:58
Start Date: 29/Mar/23 09:58
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #4114:
URL: https://github.com/apache/hive/pull/4114#discussion_r1151683610


##
ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java:
##
@@ -1538,32 +1538,36 @@ private static HdfsDirSnapshot addToSnapshot(Map<Path, 
HdfsDirSnapshot> dirToSnapshots,
   public static Map<Path, HdfsDirSnapshot> getHdfsDirSnapshots(final 
FileSystem fs, final Path path)
   throws IOException {
 Map<Path, HdfsDirSnapshot> dirToSnapshots = new HashMap<>();
-RemoteIterator<LocatedFileStatus> itr = FileUtils.listFiles(fs, path, 
true, acidHiddenFileFilter);
-while (itr.hasNext()) {
-  FileStatus fStatus = itr.next();
-  Path fPath = fStatus.getPath();
-  if (fStatus.isDirectory() && acidTempDirFilter.accept(fPath)) {
-addToSnapshot(dirToSnapshots, fPath);
-  } else {
-Path parentDirPath = fPath.getParent();
-if (acidTempDirFilter.accept(parentDirPath)) {
-  while (isChildOfDelta(parentDirPath, path)) {
-// Some cases there are other directory layers between the delta 
and the datafiles
-// (export-import mm table, insert with union all to mm table, 
skewed tables).
-// But it does not matter for the AcidState, we just need the 
deltas and the data files
-// So build the snapshot with the files inside the delta directory
-parentDirPath = parentDirPath.getParent();
-  }
-  HdfsDirSnapshot dirSnapshot = addToSnapshot(dirToSnapshots, 
parentDirPath);
-  // We're not filtering out the metadata file and acid format file,
-  // as they represent parts of a valid snapshot
-  // We're not using the cached values downstream, but we can 
potentially optimize more in a follow-up task
-  if 
(fStatus.getPath().toString().contains(MetaDataFile.METADATA_FILE)) {
-dirSnapshot.addMetadataFile(fStatus);
-  } else if 
(fStatus.getPath().toString().contains(OrcAcidVersion.ACID_FORMAT)) {
-dirSnapshot.addOrcAcidFormatFile(fStatus);
-  } else {
-dirSnapshot.addFile(fStatus);
+Deque<RemoteIterator<LocatedFileStatus>> stack = new ArrayDeque<>();
+stack.push(FileUtils.listLocatedStatusIterator(fs, path, 
acidHiddenFileFilter));
+while (!stack.isEmpty()) {
+  RemoteIterator<LocatedFileStatus> itr = stack.pop();
+  while (itr.hasNext()) {
+FileStatus fStatus = itr.next();
+Path fPath = fStatus.getPath();
+if (fStatus.isDirectory()) {
+  stack.push(FileUtils.listLocatedStatusIterator(fs, fPath, 
acidHiddenFileFilter));

Review Comment:
   what if the folder is empty? that was previously included
   
   if (fStatus.isDirectory() && acidTempDirFilter.accept(fPath)) {
   addToSnapshot(dirToSnapshots, fPath);
   





Issue Time Tracking
---

Worklog Id: (was: 853641)
Time Spent: 5h 50m  (was: 5h 40m)

> AcidUtils#getHdfsDirSnapshots() throws FNFE when a directory is removed in 
> HDFS
> ---
>
> Key: HIVE-27135
> URL: https://issues.apache.org/jira/browse/HIVE-27135
> Project: Hive
>  Issue Type: Bug
>Reporter: Dayakar M
>Assignee: Dayakar M
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> AcidUtils#getHdfsDirSnapshots() throws FileNotFoundException when a directory 
> is removed in HDFS while fetching HDFS Snapshots.
> The test code below can be used to reproduce this issue.
> {code:java}
>  @Test
>   public void 
> testShouldNotThrowFNFEWhenHiveStagingDirectoryIsRemovedWhileFetchingHDFSSnapshots()
>  throws Exception {
> MockFileSystem fs = new MockFileSystem(new HiveConf(),
> new MockFile("mock:/tbl/part1/.hive-staging_dir/-ext-10002", 500, new 
> byte[0]),
> new MockFile("mock:/tbl/part2/.hive-staging_dir", 500, new byte[0]),
> new MockFile("mock:/tbl/part1/_tmp_space.db", 500, new byte[0]),
> new MockFile("mock:/tbl/part1/delta_1_1/bucket--", 500, new 
> byte[0]));
> Path path = new MockPath(fs, "/tbl");
> Path stageDir = new MockPath(fs, "mock:/tbl/part1/.hive-staging_dir");
> FileSystem mockFs = spy(fs);
> Mockito.doThrow(new 
> FileNotFoundException("")).when(mockFs).listLocatedStatus(eq(stageDir));
> try {
>   Map<Path, HdfsDirSnapshot> hdfsDirSnapshots = 
> AcidUtils.getHdfsDirSnapshots(mockFs, path);
>   Assert.assertEquals(1, hdfsDirSnapshots.size());
> }
> catch (FileNotFoundException fnf) {
>   fail("Should not throw 

[jira] [Work logged] (HIVE-27135) AcidUtils#getHdfsDirSnapshots() throws FNFE when a directory is removed in HDFS

2023-03-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27135?focusedWorklogId=853640=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853640
 ]

ASF GitHub Bot logged work on HIVE-27135:
-

Author: ASF GitHub Bot
Created on: 29/Mar/23 09:55
Start Date: 29/Mar/23 09:55
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #4114:
URL: https://github.com/apache/hive/pull/4114#discussion_r1151684557


##
ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java:
##
@@ -1538,32 +1538,36 @@ private static HdfsDirSnapshot addToSnapshot(Map<Path, 
HdfsDirSnapshot> dirToSnapshots,
   public static Map<Path, HdfsDirSnapshot> getHdfsDirSnapshots(final 
FileSystem fs, final Path path)
   throws IOException {
 Map<Path, HdfsDirSnapshot> dirToSnapshots = new HashMap<>();
-RemoteIterator<LocatedFileStatus> itr = FileUtils.listFiles(fs, path, 
true, acidHiddenFileFilter);
-while (itr.hasNext()) {
-  FileStatus fStatus = itr.next();
-  Path fPath = fStatus.getPath();
-  if (fStatus.isDirectory() && acidTempDirFilter.accept(fPath)) {
-addToSnapshot(dirToSnapshots, fPath);
-  } else {
-Path parentDirPath = fPath.getParent();
-if (acidTempDirFilter.accept(parentDirPath)) {
-  while (isChildOfDelta(parentDirPath, path)) {
-// Some cases there are other directory layers between the delta 
and the datafiles
-// (export-import mm table, insert with union all to mm table, 
skewed tables).
-// But it does not matter for the AcidState, we just need the 
deltas and the data files
-// So build the snapshot with the files inside the delta directory
-parentDirPath = parentDirPath.getParent();
-  }
-  HdfsDirSnapshot dirSnapshot = addToSnapshot(dirToSnapshots, 
parentDirPath);
-  // We're not filtering out the metadata file and acid format file,
-  // as they represent parts of a valid snapshot
-  // We're not using the cached values downstream, but we can 
potentially optimize more in a follow-up task
-  if 
(fStatus.getPath().toString().contains(MetaDataFile.METADATA_FILE)) {
-dirSnapshot.addMetadataFile(fStatus);
-  } else if 
(fStatus.getPath().toString().contains(OrcAcidVersion.ACID_FORMAT)) {
-dirSnapshot.addOrcAcidFormatFile(fStatus);
-  } else {
-dirSnapshot.addFile(fStatus);
+Deque<RemoteIterator<LocatedFileStatus>> stack = new ArrayDeque<>();
+stack.push(FileUtils.listLocatedStatusIterator(fs, path, 
acidHiddenFileFilter));
+while (!stack.isEmpty()) {
+  RemoteIterator<LocatedFileStatus> itr = stack.pop();
+  while (itr.hasNext()) {
+FileStatus fStatus = itr.next();
+Path fPath = fStatus.getPath();
+if (fStatus.isDirectory()) {
+  stack.push(FileUtils.listLocatedStatusIterator(fs, fPath, 
acidHiddenFileFilter));
+} else {
+  Path parentDirPath = fPath.getParent();
+  if (acidTempDirFilter.accept(parentDirPath)) {

Review Comment:
    





Issue Time Tracking
---

Worklog Id: (was: 853640)
Time Spent: 5h 40m  (was: 5.5h)

> AcidUtils#getHdfsDirSnapshots() throws FNFE when a directory is removed in 
> HDFS
> ---
>
> Key: HIVE-27135
> URL: https://issues.apache.org/jira/browse/HIVE-27135
> Project: Hive
>  Issue Type: Bug
>Reporter: Dayakar M
>Assignee: Dayakar M
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 5h 40m
>  Remaining Estimate: 0h
>
> AcidUtils#getHdfsDirSnapshots() throws FileNotFoundException when a directory 
> is removed in HDFS while fetching HDFS Snapshots.
> The test code below can be used to reproduce this issue.
> {code:java}
>  @Test
>   public void 
> testShouldNotThrowFNFEWhenHiveStagingDirectoryIsRemovedWhileFetchingHDFSSnapshots()
>  throws Exception {
> MockFileSystem fs = new MockFileSystem(new HiveConf(),
> new MockFile("mock:/tbl/part1/.hive-staging_dir/-ext-10002", 500, new 
> byte[0]),
> new MockFile("mock:/tbl/part2/.hive-staging_dir", 500, new byte[0]),
> new MockFile("mock:/tbl/part1/_tmp_space.db", 500, new byte[0]),
> new MockFile("mock:/tbl/part1/delta_1_1/bucket--", 500, new 
> byte[0]));
> Path path = new MockPath(fs, "/tbl");
> Path stageDir = new MockPath(fs, "mock:/tbl/part1/.hive-staging_dir");
> FileSystem mockFs = spy(fs);
> Mockito.doThrow(new 
> FileNotFoundException("")).when(mockFs).listLocatedStatus(eq(stageDir));
> try {
>   Map<Path, HdfsDirSnapshot> hdfsDirSnapshots = 
> AcidUtils.getHdfsDirSnapshots(mockFs, path);
>   Assert.assertEquals(1, hdfsDirSnapshots.size());
> }
> catch (FileNotFoundException fnf) {
>   fail("Should not throw FileNotFoundException when a directory is 
> removed while 

[jira] [Work logged] (HIVE-27135) AcidUtils#getHdfsDirSnapshots() throws FNFE when a directory is removed in HDFS

2023-03-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27135?focusedWorklogId=853639=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853639
 ]

ASF GitHub Bot logged work on HIVE-27135:
-

Author: ASF GitHub Bot
Created on: 29/Mar/23 09:54
Start Date: 29/Mar/23 09:54
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #4114:
URL: https://github.com/apache/hive/pull/4114#discussion_r1151683610


##
ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java:
##
@@ -1538,32 +1538,36 @@ private static HdfsDirSnapshot addToSnapshot(Map<Path, 
HdfsDirSnapshot> dirToSnapshots,
   public static Map<Path, HdfsDirSnapshot> getHdfsDirSnapshots(final 
FileSystem fs, final Path path)
   throws IOException {
 Map<Path, HdfsDirSnapshot> dirToSnapshots = new HashMap<>();
-RemoteIterator<LocatedFileStatus> itr = FileUtils.listFiles(fs, path, 
true, acidHiddenFileFilter);
-while (itr.hasNext()) {
-  FileStatus fStatus = itr.next();
-  Path fPath = fStatus.getPath();
-  if (fStatus.isDirectory() && acidTempDirFilter.accept(fPath)) {
-addToSnapshot(dirToSnapshots, fPath);
-  } else {
-Path parentDirPath = fPath.getParent();
-if (acidTempDirFilter.accept(parentDirPath)) {
-  while (isChildOfDelta(parentDirPath, path)) {
-// Some cases there are other directory layers between the delta 
and the datafiles
-// (export-import mm table, insert with union all to mm table, 
skewed tables).
-// But it does not matter for the AcidState, we just need the 
deltas and the data files
-// So build the snapshot with the files inside the delta directory
-parentDirPath = parentDirPath.getParent();
-  }
-  HdfsDirSnapshot dirSnapshot = addToSnapshot(dirToSnapshots, 
parentDirPath);
-  // We're not filtering out the metadata file and acid format file,
-  // as they represent parts of a valid snapshot
-  // We're not using the cached values downstream, but we can 
potentially optimize more in a follow-up task
-  if 
(fStatus.getPath().toString().contains(MetaDataFile.METADATA_FILE)) {
-dirSnapshot.addMetadataFile(fStatus);
-  } else if 
(fStatus.getPath().toString().contains(OrcAcidVersion.ACID_FORMAT)) {
-dirSnapshot.addOrcAcidFormatFile(fStatus);
-  } else {
-dirSnapshot.addFile(fStatus);
+Deque<RemoteIterator<LocatedFileStatus>> stack = new ArrayDeque<>();
+stack.push(FileUtils.listLocatedStatusIterator(fs, path, 
acidHiddenFileFilter));
+while (!stack.isEmpty()) {
+  RemoteIterator<LocatedFileStatus> itr = stack.pop();
+  while (itr.hasNext()) {
+FileStatus fStatus = itr.next();
+Path fPath = fStatus.getPath();
+if (fStatus.isDirectory()) {
+  stack.push(FileUtils.listLocatedStatusIterator(fs, fPath, 
acidHiddenFileFilter));

Review Comment:
   what if the folder is empty? that was previously included





Issue Time Tracking
---

Worklog Id: (was: 853639)
Time Spent: 5.5h  (was: 5h 20m)

> AcidUtils#getHdfsDirSnapshots() throws FNFE when a directory is removed in 
> HDFS
> ---
>
> Key: HIVE-27135
> URL: https://issues.apache.org/jira/browse/HIVE-27135
> Project: Hive
>  Issue Type: Bug
>Reporter: Dayakar M
>Assignee: Dayakar M
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>
> AcidUtils#getHdfsDirSnapshots() throws FileNotFoundException when a directory 
> is removed in HDFS while fetching HDFS Snapshots.
> The test code below can be used to reproduce this issue.
> {code:java}
>  @Test
>   public void 
> testShouldNotThrowFNFEWhenHiveStagingDirectoryIsRemovedWhileFetchingHDFSSnapshots()
>  throws Exception {
> MockFileSystem fs = new MockFileSystem(new HiveConf(),
> new MockFile("mock:/tbl/part1/.hive-staging_dir/-ext-10002", 500, new 
> byte[0]),
> new MockFile("mock:/tbl/part2/.hive-staging_dir", 500, new byte[0]),
> new MockFile("mock:/tbl/part1/_tmp_space.db", 500, new byte[0]),
> new MockFile("mock:/tbl/part1/delta_1_1/bucket--", 500, new 
> byte[0]));
> Path path = new MockPath(fs, "/tbl");
> Path stageDir = new MockPath(fs, "mock:/tbl/part1/.hive-staging_dir");
> FileSystem mockFs = spy(fs);
> Mockito.doThrow(new 
> FileNotFoundException("")).when(mockFs).listLocatedStatus(eq(stageDir));
> try {
>   Map<Path, HdfsDirSnapshot> hdfsDirSnapshots = 
> AcidUtils.getHdfsDirSnapshots(mockFs, path);
>   Assert.assertEquals(1, hdfsDirSnapshots.size());
> }
> catch (FileNotFoundException fnf) {
>   fail("Should not throw FileNotFoundException when a directory is 
> removed while fetching HDFSSnapshots");
> }
>   }{code}
> This issue was partially fixed as part of 

[jira] [Work logged] (HIVE-27135) AcidUtils#getHdfsDirSnapshots() throws FNFE when a directory is removed in HDFS

2023-03-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27135?focusedWorklogId=853638=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853638
 ]

ASF GitHub Bot logged work on HIVE-27135:
-

Author: ASF GitHub Bot
Created on: 29/Mar/23 09:51
Start Date: 29/Mar/23 09:51
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #4114:
URL: https://github.com/apache/hive/pull/4114#discussion_r1151679605


##
common/src/java/org/apache/hadoop/hive/common/FileUtils.java:
##
@@ -1376,6 +1376,12 @@ public static RemoteIterator<FileStatus> 
listStatusIterator(FileSystem fs, Path
 status -> filter.accept(status.getPath()));
   }
 
+  public static RemoteIterator<LocatedFileStatus> 
listLocatedStatusIterator(FileSystem fs, Path path, PathFilter filter)

Review Comment:
   it's not required, FileStatus object is enough





Issue Time Tracking
---

Worklog Id: (was: 853638)
Time Spent: 5h 20m  (was: 5h 10m)

> AcidUtils#getHdfsDirSnapshots() throws FNFE when a directory is removed in 
> HDFS
> ---
>
> Key: HIVE-27135
> URL: https://issues.apache.org/jira/browse/HIVE-27135
> Project: Hive
>  Issue Type: Bug
>Reporter: Dayakar M
>Assignee: Dayakar M
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> AcidUtils#getHdfsDirSnapshots() throws FileNotFoundException when a directory 
> is removed in HDFS while fetching HDFS Snapshots.
> The test code below can be used to reproduce this issue.
> {code:java}
>  @Test
>   public void 
> testShouldNotThrowFNFEWhenHiveStagingDirectoryIsRemovedWhileFetchingHDFSSnapshots()
>  throws Exception {
> MockFileSystem fs = new MockFileSystem(new HiveConf(),
> new MockFile("mock:/tbl/part1/.hive-staging_dir/-ext-10002", 500, new 
> byte[0]),
> new MockFile("mock:/tbl/part2/.hive-staging_dir", 500, new byte[0]),
> new MockFile("mock:/tbl/part1/_tmp_space.db", 500, new byte[0]),
> new MockFile("mock:/tbl/part1/delta_1_1/bucket--", 500, new 
> byte[0]));
> Path path = new MockPath(fs, "/tbl");
> Path stageDir = new MockPath(fs, "mock:/tbl/part1/.hive-staging_dir");
> FileSystem mockFs = spy(fs);
> Mockito.doThrow(new 
> FileNotFoundException("")).when(mockFs).listLocatedStatus(eq(stageDir));
> try {
>   Map<Path, HdfsDirSnapshot> hdfsDirSnapshots = 
> AcidUtils.getHdfsDirSnapshots(mockFs, path);
>   Assert.assertEquals(1, hdfsDirSnapshots.size());
> }
> catch (FileNotFoundException fnf) {
>   fail("Should not throw FileNotFoundException when a directory is 
> removed while fetching HDFSSnapshots");
> }
>   }{code}
> This issue was partially fixed as part of HIVE-26481, but it is not fixed 
> completely. 
> [Here|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java#L1541]
>  the FileUtils.listFiles() API is used, which returns a RemoteIterator. 
> While iterating, it checks whether an entry is a directory and, when listing 
> recursively, tries to list the files of that directory; if that directory has 
> been removed by another thread/task in the meantime, it throws 
> FileNotFoundException. Here the removed directory is the .staging directory, 
> which should have been excluded via the passed filter.
>  
> So we can use the same logic written in the 
> _org.apache.hadoop.hive.ql.io.AcidUtils#getHdfsDirSnapshotsForCleaner()_ API 
> to avoid the FileNotFoundException.
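> A hedged sketch of that tolerant-listing idea (helper name and structure 
> assumed, not the exact patch): list each directory non-recursively and skip 
> directories that disappear between listing and visiting, so a concurrently 
> removed .hive-staging directory no longer aborts the snapshot scan.
> {code:java}
> // imports assumed: java.io.FileNotFoundException, java.io.IOException,
> // java.util.*, org.apache.hadoop.fs.*
> private static List<FileStatus> listFilesTolerantly(FileSystem fs, Path root,
>     PathFilter filter) throws IOException {
>   List<FileStatus> result = new ArrayList<>();
>   Deque<Path> pending = new ArrayDeque<>();
>   pending.push(root);
>   while (!pending.isEmpty()) {
>     Path dir = pending.pop();
>     RemoteIterator<FileStatus> it;
>     try {
>       it = fs.listStatusIterator(dir);
>     } catch (FileNotFoundException e) {
>       continue; // directory removed concurrently, e.g. a .hive-staging dir
>     }
>     try {
>       while (it.hasNext()) {
>         FileStatus status = it.next();
>         if (!filter.accept(status.getPath())) {
>           continue;
>         }
>         if (status.isDirectory()) {
>           pending.push(status.getPath());
>         } else {
>           result.add(status);
>         }
>       }
>     } catch (FileNotFoundException e) {
>       // directory vanished mid-iteration; skip the rest of its entries
>     }
>   }
>   return result;
> }
> {code}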



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27189) Remove duplicate info log in Hive.isSubDir

2023-03-29 Thread shuyouZZ (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

shuyouZZ updated HIVE-27189:

Description: 
In class {{org.apache.hadoop.hive.ql.metadata.Hive}}, invoking the method 
{{isSubDir}} prints the following line twice:
{code:java}
LOG.debug("The source path is " + fullF1 + " and the destination path is " + 
fullF2);{code}
We should remove the duplicate log statement.

> Remove duplicate info log in Hive.isSubDir
> --
>
> Key: HIVE-27189
> URL: https://issues.apache.org/jira/browse/HIVE-27189
> Project: Hive
>  Issue Type: Improvement
>Reporter: shuyouZZ
>Priority: Major
>
> In class {{org.apache.hadoop.hive.ql.metadata.Hive}}, invoking the method 
> {{isSubDir}} prints the following line twice:
> {code:java}
> LOG.debug("The source path is " + fullF1 + " and the destination path is " + 
> fullF2);{code}
> We should remove the duplicate log statement.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27033) Backport of HIVE-23044: Make sure Cleaner doesn't delete delta directories for running queries

2023-03-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27033?focusedWorklogId=853625=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853625
 ]

ASF GitHub Bot logged work on HIVE-27033:
-

Author: ASF GitHub Bot
Created on: 29/Mar/23 08:42
Start Date: 29/Mar/23 08:42
Worklog Time Spent: 10m 
  Work Description: amanraj2520 commented on PR #4027:
URL: https://github.com/apache/hive/pull/4027#issuecomment-1488176389

   @zabetak @abstractdog @vihangk1 Can you please approve and merge this? This 
was already present in the Hive 3.1.3 release.




Issue Time Tracking
---

Worklog Id: (was: 853625)
Time Spent: 0.5h  (was: 20m)

> Backport of HIVE-23044: Make sure Cleaner doesn't delete delta directories 
> for running queries
> --
>
> Key: HIVE-27033
> URL: https://issues.apache.org/jira/browse/HIVE-27033
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Aman Raj
>Assignee: Aman Raj
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27145) Use StrictMath for remaining Math functions as followup of HIVE-23133

2023-03-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27145?focusedWorklogId=853598=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853598
 ]

ASF GitHub Bot logged work on HIVE-27145:
-

Author: ASF GitHub Bot
Created on: 29/Mar/23 07:40
Start Date: 29/Mar/23 07:40
Worklog Time Spent: 10m 
  Work Description: kasakrisz commented on PR #4122:
URL: https://github.com/apache/hive/pull/4122#issuecomment-1488093652

   @rbalamohan 
   IIUC, the `Math` implementation of the functions affected by this PR may 
exploit specific microprocessor instructions where the platform provides them, 
which gives better performance, while the default implementation just calls 
the `StrictMath` version.
   The cost of the performance boost is precision.
   
   Example from the PR: 
   `Degrees(cdecimal1)`
   ```
   -6844.522849943508
   ```
   vs
   ```
   -6844.522849944  
   ```
   
   Should we give up possible performance benefits to favor precision? Could 
you please share your thoughts?
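   For illustration (hypothetical snippet, not from the PR): `Math.pow` may be 
replaced by a platform intrinsic, while `StrictMath.pow` follows fdlibm and is 
reproducible bit-for-bit on every platform, so the two results can differ in 
the last bits on some JVMs/hardware:
   ```java
   double x = 0.123456789d;
   double y = 3.14159d;
   System.out.println(Math.pow(x, y));        // may use a hardware intrinsic
   System.out.println(StrictMath.pow(x, y));  // fdlibm result, same everywhere
   System.out.println(Math.pow(x, y) == StrictMath.pow(x, y)); // may be false
   ```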




Issue Time Tracking
---

Worklog Id: (was: 853598)
Time Spent: 40m  (was: 0.5h)

> Use StrictMath for remaining Math functions as followup of HIVE-23133
> -
>
> Key: HIVE-27145
> URL: https://issues.apache.org/jira/browse/HIVE-27145
> Project: Hive
>  Issue Type: Task
>  Components: UDF
>Reporter: Himanshu Mishra
>Assignee: Himanshu Mishra
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> [HIVE-23133|https://issues.apache.org/jira/browse/HIVE-23133] started using 
> {{StrictMath}} for the {{cos, exp, log}} UDFs to fix QTests failing because 
> results vary with the hardware when the Math library is used.
> Follow it up by using {{StrictMath}} for the other Math functions that can be 
> affected by the underlying hardware in the same way, namely {{sin, tan, asin, 
> acos, atan, sqrt, pow, cbrt}}.
> [JDK-4477961|https://bugs.openjdk.org/browse/JDK-4477961] (in Java 9) changed 
> the radians and degrees calculation, leading to QTest failures when tests are 
> run on Java 9+; fix such tests.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27187) Incremental rebuild of materialized view having aggregate and stored by iceberg

2023-03-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27187?focusedWorklogId=853595=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853595
 ]

ASF GitHub Bot logged work on HIVE-27187:
-

Author: ASF GitHub Bot
Created on: 29/Mar/23 07:36
Start Date: 29/Mar/23 07:36
Worklog Time Spent: 10m 
  Work Description: sonarcloud[bot] commented on PR #4166:
URL: https://github.com/apache/hive/pull/4166#issuecomment-1488089545

   Kudos, SonarCloud Quality Gate passed!
   
   Bugs: 0 (rating A) | Vulnerabilities: 0 (rating A) | Security Hotspots: 0 
(rating A) | Code Smells: 4 (rating A)
   No Coverage information | No Duplication information
   Full report: https://sonarcloud.io/dashboard?id=apache_hive=4166
   
   




Issue Time Tracking
---

Worklog Id: (was: 853595)
Time Spent: 20m  (was: 10m)

> Incremental rebuild of materialized view having aggregate and stored by 
> iceberg
> ---
>
> Key: HIVE-27187
> URL: https://issues.apache.org/jira/browse/HIVE-27187
> Project: Hive
>  Issue Type: Improvement
>  Components: Iceberg integration, Materialized views
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently the incremental rebuild of a materialized view stored by Iceberg 
> whose definition query contains an aggregate operator is transformed into an 
> insert overwrite statement containing a union operator, provided the source 
> tables contain insert operations only. One branch of the union scans the 
> view, the other produces the delta.
> This can be improved further: transform the statement into a multi-insert 
> statement representing a merge statement, to insert new aggregations and 
> update existing ones.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27172) Add the HMS client connection timeout config

2023-03-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27172?focusedWorklogId=853592=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853592
 ]

ASF GitHub Bot logged work on HIVE-27172:
-

Author: ASF GitHub Bot
Created on: 29/Mar/23 07:33
Start Date: 29/Mar/23 07:33
Worklog Time Spent: 10m 
  Work Description: wecharyu commented on PR #4150:
URL: https://github.com/apache/hive/pull/4150#issuecomment-1488086581

   @ayushtkn @deniskuzZ @kasakrisz: Could you help review this PR?




Issue Time Tracking
---

Worklog Id: (was: 853592)
Time Spent: 1h 10m  (was: 1h)

> Add the HMS client connection timeout config
> 
>
> Key: HIVE-27172
> URL: https://issues.apache.org/jira/browse/HIVE-27172
> Project: Hive
>  Issue Type: Task
>  Components: Hive
>Reporter: Wechar
>Assignee: Wechar
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Currently {{HiveMetastoreClient}} uses {{CLIENT_SOCKET_TIMEOUT}} as both the 
> socket timeout and the connection timeout, which makes it inconvenient for 
> users to set a smaller connection timeout.
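> For illustration, Thrift's {{TSocket}} already distinguishes the two values, 
> so a dedicated config would map onto the last constructor argument (variable 
> names assumed; a sketch, not the actual patch):
> {code:java}
> // socketTimeoutMs bounds reads on an established connection, while
> // connectTimeoutMs bounds only the initial TCP connect.
> TSocket transport = new TSocket(host, port, socketTimeoutMs, connectTimeoutMs);
> {code}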



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26997) Iceberg: Vectorization gets disabled at runtime in merge-into statements

2023-03-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26997?focusedWorklogId=853569=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853569
 ]

ASF GitHub Bot logged work on HIVE-26997:
-

Author: ASF GitHub Bot
Created on: 29/Mar/23 06:28
Start Date: 29/Mar/23 06:28
Worklog Time Spent: 10m 
  Work Description: kasakrisz commented on code in PR #4162:
URL: https://github.com/apache/hive/pull/4162#discussion_r1151445817


##
iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/IcebergAcidUtil.java:
##
@@ -93,10 +95,16 @@ public static Schema 
createFileReadSchemaWithVirtualColums(List 
dataCols) {
-List cols = 
Lists.newArrayListWithCapacity(dataCols.size() + SERDE_META_COLS.size());
+  public static Schema createSerdeSchemaForDelete(List 
dataCols, boolean partitioned,
+  Properties serDeProperties) {
+boolean skipRowData = 
Boolean.parseBoolean(serDeProperties.getProperty(WriterBuilder.ICEBERG_DELETE_SKIPROWDATA,
+WriterBuilder.ICEBERG_DELETE_SKIPROWDATA_DEFAULT));
+List cols = Lists.newArrayListWithCapacity(
+SERDE_META_COLS.size() + (skipRowData || partitioned ? 0 : 
dataCols.size()));

Review Comment:
   is it `skipRowData && !partitioned` ?



##
iceberg/iceberg-handler/src/test/queries/positive/vectorized_iceberg_merge_mixed.q:
##
@@ -0,0 +1,197 @@
+

Issue Time Tracking
---

Worklog Id: (was: 853569)
Time Spent: 1h 20m  (was: 1h 10m)

> Iceberg: Vectorization gets disabled at runtime in merge-into statements
> 
>
> Key: HIVE-26997
> URL: https://issues.apache.org/jira/browse/HIVE-26997
> Project: Hive
>  Issue Type: Improvement
>  Components: Iceberg integration
>Reporter: Rajesh Balamohan
>Assignee: Zsolt Miskolczi
>Priority: Major
>  Labels: pull-request-available
> Attachments: explain_merge_into.txt
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> *Query:*
> Think of the "ssv" table as a table containing trickle-feed data in the 
> following query. "store_sales_delete_1" is the destination table.
>  
> {noformat}
> MERGE INTO tpcds_1000_iceberg_mor_v4.store_sales_delete_1 t USING 
> tpcds_1000_update.ssv s ON (t.ss_item_sk = s.ss_item_sk
>                                                                               
>                 AND t.ss_customer_sk=s.ss_customer_sk
>                                                                               
>                 AND t.ss_sold_date_sk = "2451181"
>                                                                               
>                 AND ((Floor((s.ss_item_sk) / 1000) * 1000) BETWEEN 1000 AND 
> 2000)
>                                                                               
>                 AND s.ss_ext_discount_amt < 0.0) WHEN matched
> AND t.ss_ext_discount_amt IS NULL THEN
> UPDATE
> SET ss_ext_discount_amt = 0.0 WHEN NOT matched THEN
> INSERT (ss_sold_time_sk,
>         ss_item_sk,
>         ss_customer_sk,
>         ss_cdemo_sk,
>         ss_hdemo_sk,
>         ss_addr_sk,
>         ss_store_sk,
>         ss_promo_sk,
>         ss_ticket_number,
>         ss_quantity,
>         ss_wholesale_cost,
>         ss_list_price,
>         ss_sales_price,
>         ss_ext_discount_amt,
>         ss_ext_sales_price,
>         ss_ext_wholesale_cost,
>         ss_ext_list_price,
>         ss_ext_tax,
>         ss_coupon_amt,
>         ss_net_paid,
>         ss_net_paid_inc_tax,
>         ss_net_profit,
>         ss_sold_date_sk)
> VALUES (s.ss_sold_time_sk,
>         s.ss_item_sk,
>         s.ss_customer_sk,
>         s.ss_cdemo_sk,
>         s.ss_hdemo_sk,
>         s.ss_addr_sk,
>         s.ss_store_sk,
>         s.ss_promo_sk,
>         s.ss_ticket_number,
>         s.ss_quantity,
>         s.ss_wholesale_cost,
>         s.ss_list_price,
>         s.ss_sales_price,
>         s.ss_ext_discount_amt,
>         s.ss_ext_sales_price,
>         s.ss_ext_wholesale_cost,
>         s.ss_ext_list_price,
>         s.ss_ext_tax,
>         s.ss_coupon_amt,
>         s.ss_net_paid,
>         s.ss_net_paid_inc_tax,
>         s.ss_net_profit,
>         "2451181")
>  {noformat}
>  
>  
> *Issue:*
>  # Map phase is not getting vectorized due to "PARTITION_{_}SPEC{_}_ID" column
> {noformat}
> Map notVectorizedReason: Select expression for SELECT operator: Virtual 
> column PARTITION__SPEC__ID is not supported {noformat}
>  
> 2. "Reducer 2" stage isn't vectorized. 
> {noformat}
> Reduce notVectorizedReason: exception: java.lang.RuntimeException: Full Outer 
> Small Table Key Mapping duplicate column 0 in ordered column map {0=(value 
> column: 30, type info: int), 1=(value column: 31, type info: int)} when 
> adding value column 53, type 

[jira] [Work logged] (HIVE-26968) SharedWorkOptimizer merges TableScan operators that have different DPP parents

2023-03-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26968?focusedWorklogId=853567=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853567
 ]

ASF GitHub Bot logged work on HIVE-26968:
-

Author: ASF GitHub Bot
Created on: 29/Mar/23 06:24
Start Date: 29/Mar/23 06:24
Worklog Time Spent: 10m 
  Work Description: ngsg commented on PR #3981:
URL: https://github.com/apache/hive/pull/3981#issuecomment-1488012344

   Hello @zabetak. I have added a new qfile, which validates my PR. In a 
nutshell, this qfile submits the same query twice while varying the value of 
hive.optimize.shared.work.dppunion. I checked that current Hive produces 
different results as I described in the JIRA issue 
(https://issues.apache.org/jira/browse/HIVE-26968). Could you please review the 
changes?  Thank you.




Issue Time Tracking
---

Worklog Id: (was: 853567)
Time Spent: 40m  (was: 0.5h)

> SharedWorkOptimizer merges TableScan operators that have different DPP parents
> --
>
> Key: HIVE-26968
> URL: https://issues.apache.org/jira/browse/HIVE-26968
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 4.0.0-alpha-2
>Reporter: Seonggon Namgung
>Assignee: Seonggon Namgung
>Priority: Critical
>  Labels: hive-4.0.0-must, pull-request-available
> Attachments: TPC-DS Query64 OperatorGraph.pdf
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> SharedWorkOptimizer merges TableScan operators that have different DPP 
> parents, which leads to the creation of a semantically wrong query plan.
> In our environment, running TPC-DS query64 on a 1TB Iceberg-format table 
> returns no rows because of this problem. (The correct result has 7094 rows.)
> We use hive.optimize.shared.work=true, 
> hive.optimize.shared.work.extended=true, and 
> hive.optimize.shared.work.dppunion=false to reproduce the bug.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HIVE-27176) EXPLAIN SKEW

2023-03-29 Thread Yuming Wang (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17706227#comment-17706227
 ] 

Yuming Wang commented on HIVE-27176:


+1. Our internal Spark also supports a similar feature: 
https://issues.apache.org/jira/browse/SPARK-35837

> EXPLAIN SKEW 
> 
>
> Key: HIVE-27176
> URL: https://issues.apache.org/jira/browse/HIVE-27176
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Priority: Major
>
> Thinking about a new explain feature, which is actually not an explain but a 
> set of analytical queries: consider a very complicated and large SQL 
> statement (the one below is a simple one, just for example's sake):
> {code}
> SELECT a FROM (SELECT b ... JOIN c on b.x = c.y) d JOIN e ON d.v = e.w
> {code}
> EXPLAIN SKEW under the hood should run a query like:
> {code}
> SELECT "b", "x", x, count (distinct b.x) as count order by count desc limit 50
> UNION ALL
> SELECT "c", "y", y, count (distinct c.y) as count order by count desc limit 50
> UNION ALL
> SELECT "d", "v", v count (distinct d.v) as count order by count desc limit 50
> UNION ALL
> SELECT "e", "w", w, count (distinct e.w) as count order by count desc limit 50
> {code}
> collecting some cardinality info about all the join columns found in the 
> query, so the result might look like:
> {code}
> table_name column_name column_value count
> b "x" x_skew_value1 100431234
> b "x" x_skew_value2 234
> c "y" y_skew_value1 35
> c "y" y_skew_value2 45
> c "y" y_skew_value3 42
> ...
> {code}
> This doesn't solve the problem by itself; instead it shows the data skew 
> immediately for further analysis. It also doesn't suffer from the 
> incomplete-stats problem, as it really has to query the data on the cluster.
> One more thing to check: the reducer key is not always a join column, e.g. in 
> the case of PTF;
> maybe we should build a plan and simply iterate over all reduce sink keys 
> instead of over the join columns.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)