[jira] [Updated] (HIVE-25300) Fix hive conf items validator type
[ https://issues.apache.org/jira/browse/HIVE-25300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HIVE-25300:
----------------------------------
    Labels: pull-request-available  (was: )

> Fix hive conf items validator type
> ----------------------------------
>
>                 Key: HIVE-25300
>                 URL: https://issues.apache.org/jira/browse/HIVE-25300
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Jeff Min
>            Priority: Minor
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Hive conf items should use RangeValidator:
> # hive.mv.files.thread
> # hive.load.dynamic.partitions.thread
> # hive.exec.input.listing.max.threads

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
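The issue above asks for range validation on thread-count settings. As a rough illustration of what such a validator does, here is a hypothetical minimal range check (this is not Hive's actual RangeValidator class, just a sketch of the behavior it provides: values outside [min, max] are rejected instead of silently accepted).

```java
// Hypothetical minimal range check, illustrating what a RangeValidator
// provides for thread-count settings such as hive.mv.files.thread.
public class RangeCheck {
    private final long min;
    private final long max;

    public RangeCheck(long min, long max) {
        this.min = min;
        this.max = max;
    }

    /** Returns null if the value is valid, otherwise a description of the problem. */
    public String validate(String value) {
        try {
            long v = Long.parseLong(value);
            if (v < min || v > max) {
                return "Invalid value " + v + ", expected a value in [" + min + ", " + max + "]";
            }
            return null;
        } catch (NumberFormatException e) {
            return "Invalid value " + value + ", expected a number";
        }
    }

    public static void main(String[] args) {
        RangeCheck threads = new RangeCheck(1, 1024);
        System.out.println(threads.validate("15"));  // valid: prints null
        System.out.println(threads.validate("-1"));  // rejected: out of range
    }
}
```

The point of switching validator types is that a negative or non-numeric thread count fails at configuration time rather than surfacing later as a runtime error.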
[jira] [Work logged] (HIVE-25300) Fix hive conf items validator type
[ https://issues.apache.org/jira/browse/HIVE-25300?focusedWorklogId=646534&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-646534 ]

ASF GitHub Bot logged work on HIVE-25300:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 04/Sep/21 00:09
            Start Date: 04/Sep/21 00:09
    Worklog Time Spent: 10m

Work Description: github-actions[bot] commented on pull request #2439:
URL: https://github.com/apache/hive/pull/2439#issuecomment-912871432

   This pull request has been automatically marked as stale because it has not
   had recent activity. It will be closed if no further activity occurs. Feel
   free to reach out on the d...@hive.apache.org list if the patch is in need
   of reviews.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above
to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------
    Worklog Id: (was: 646534)
    Remaining Estimate: 0h
    Time Spent: 10m
[jira] [Work logged] (HIVE-25277) Slow Hive partition deletion for Cloud object stores with expensive ListFiles
[ https://issues.apache.org/jira/browse/HIVE-25277?focusedWorklogId=646433&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-646433 ]

ASF GitHub Bot logged work on HIVE-25277:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 03/Sep/21 18:39
            Start Date: 03/Sep/21 18:39
    Worklog Time Spent: 10m

Work Description: coufon commented on a change in pull request #2421:
URL: https://github.com/apache/hive/pull/2421#discussion_r702097544

File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java

@@ -5240,16 +5259,38 @@ public DropPartitionsResult drop_partitions_req(
         for (Path path : archToDelete) {
           wh.deleteDir(path, true, mustPurge, needsCm);
         }
+
+        // Uses a priority queue to delete the parents of deleted directories if empty.
+        // The parent with the largest size is always processed first. It guarantees that
+        // the emptiness of a parent won't be changed once it has been processed, so duplicated
+        // processing can be avoided.
+        PriorityQueue<PathAndPartValSize> parentsToDelete = new PriorityQueue<>();
         for (PathAndPartValSize p : dirsToDelete) {
           wh.deleteDir(p.path, true, mustPurge, needsCm);
+          addParentForDel(parentsToDelete, p);
+        }
+
+        HashSet<PathAndPartValSize> processed = new HashSet<>();
+        while (!parentsToDelete.isEmpty()) {
           try {
-            deleteParentRecursive(p.path.getParent(), p.partValSize - 1, mustPurge, needsCm);
+            PathAndPartValSize p = parentsToDelete.poll();
+            if (processed.contains(p)) {
+              continue;
+            }
+            processed.add(p);
+
+            Path path = p.path;
+            if (wh.isWritable(path) && wh.isDir(path) && wh.isEmptyDir(path)) {

Review comment:
   wh.isEmptyDir uses listStatus, which doesn't distinguish files from directories
   (at least for the GCS fs implementation:
   https://github.com/GoogleCloudDataproc/hadoop-connectors/blob/7825ab50c839aea43f1ff587b0e2803047af99bc/gcsio/src/main/java/com/google/cloud/hadoop/gcsio/GoogleCloudStorageFileSystem.java#L997).
   But I agree that isEmptyDir is enough no matter whether the path is a file or a dir.
Issue Time Tracking
-------------------
    Worklog Id: (was: 646433)
    Time Spent: 3.5h  (was: 3h 20m)

> Slow Hive partition deletion for Cloud object stores with expensive ListFiles
> -----------------------------------------------------------------------------
>
>                 Key: HIVE-25277
>                 URL: https://issues.apache.org/jira/browse/HIVE-25277
>             Project: Hive
>          Issue Type: Improvement
>          Components: Standalone Metastore
>    Affects Versions: All Versions
>            Reporter: Zhou Fang
>            Assignee: Zhou Fang
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> Deleting a Hive partition is slow when using a Cloud object store as the
> warehouse, because ListFiles is expensive there. A root cause is that the
> recursive parent dir deletion is very inefficient: there are many duplicated
> calls to isEmpty (ListFiles is called at the end). This fix sorts the parents
> to delete according to the path size, and always processes the longest one
> (e.g., a/b/c is always before a/b). As a result, each parent path only needs
> to be checked once.
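The deepest-first processing described in this fix can be sketched in isolation. This is a simplified model under stated assumptions: paths are plain strings, component count stands in for partValSize, and no filesystem calls are made; cleanupOrder only shows the order in which unique parents would be examined, not the actual emptiness checks or deletes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Set;

// Sketch of the HIVE-25277 idea: process candidate parent directories
// deepest-first so each directory is examined exactly once, even when
// several deleted children share the same parent chain.
public class DeepestFirstCleanup {
    static int depth(String path) {
        return path.split("/").length;
    }

    /** Returns the order in which unique parent paths would be examined. */
    public static List<String> cleanupOrder(List<String> deletedDirs) {
        // Deeper paths first, e.g. a/b/c before a/b.
        PriorityQueue<String> parents = new PriorityQueue<>(
            Comparator.comparingInt(DeepestFirstCleanup::depth).reversed());
        for (String dir : deletedDirs) {
            int slash = dir.lastIndexOf('/');
            if (slash > 0) {
                parents.add(dir.substring(0, slash));
            }
        }
        List<String> order = new ArrayList<>();
        Set<String> processed = new HashSet<>();
        while (!parents.isEmpty()) {
            String p = parents.poll();
            if (!processed.add(p)) {
                continue; // duplicate parent: skipped, never re-listed
            }
            order.add(p);
            int slash = p.lastIndexOf('/');
            if (slash > 0) {
                parents.add(p.substring(0, slash)); // enqueue grandparent
            }
        }
        return order;
    }

    public static void main(String[] args) {
        // Two children share parent a/b; it is examined once, before a.
        System.out.println(cleanupOrder(Arrays.asList("a/b/c", "a/b/d")));
        // prints [a/b, a]
    }
}
```

Because a/b is polled before a, its emptiness cannot change after it is processed; that is the invariant that lets the real patch replace the recursive per-child walk with one pass over the queue.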
[jira] [Work logged] (HIVE-25277) Slow Hive partition deletion for Cloud object stores with expensive ListFiles
[ https://issues.apache.org/jira/browse/HIVE-25277?focusedWorklogId=646431&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-646431 ]

ASF GitHub Bot logged work on HIVE-25277:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 03/Sep/21 18:25
            Start Date: 03/Sep/21 18:25
    Worklog Time Spent: 10m

Work Description: coufon commented on a change in pull request #2421:
URL: https://github.com/apache/hive/pull/2421#discussion_r702090099

File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java

@@ -5240,16 +5259,38 @@ public DropPartitionsResult drop_partitions_req(
+        // Uses a priority queue to delete the parents of deleted directories if empty.
+        // The parent with the largest size is always processed first. It guarantees that

Review comment:
   Done.

Issue Time Tracking
-------------------
    Worklog Id: (was: 646431)
    Time Spent: 3h 20m  (was: 3h 10m)
[jira] [Work logged] (HIVE-25277) Slow Hive partition deletion for Cloud object stores with expensive ListFiles
[ https://issues.apache.org/jira/browse/HIVE-25277?focusedWorklogId=646429&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-646429 ]

ASF GitHub Bot logged work on HIVE-25277:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 03/Sep/21 18:20
            Start Date: 03/Sep/21 18:20
    Worklog Time Spent: 10m

Work Description: coufon commented on a change in pull request #2421:
URL: https://github.com/apache/hive/pull/2421#discussion_r702088091

File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java

@@ -5104,14 +5103,34 @@ public boolean drop_partition(final String db_name, final String tbl_name,
         null);
   }

-  private static class PathAndPartValSize {
-    PathAndPartValSize(Path path, int partValSize) {
-      this.path = path;
-      this.partValSize = partValSize;
+  /** Stores a path and its size. */
+  private static class PathAndPartValSize implements Comparable<PathAndPartValSize> {
+
+    public Path path;
+    int partValSize;
+
+    public PathAndPartValSize(Path path, int partValSize) {
+      this.path = path;
+      this.partValSize = partValSize;
+    }
+
+    @Override
+    public boolean equals(Object o) {
+      if (o == this) {
+        return true;
+      }
+      if (!(o instanceof PathAndPartValSize)) {
+        return false;
+      }
+      return path.equals(((PathAndPartValSize) o).path);

Review comment:
   Nice catch. It is a bug. The current code didn't actually use the HashSet
   correctly: it relied on the object's default hash code rather than one
   derived from the (path, depth) pair.

Issue Time Tracking
-------------------
    Worklog Id: (was: 646429)
    Time Spent: 3h  (was: 2h 50m)
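The bug acknowledged in the review above follows from Java's equals/hashCode contract: HashSet compares stored hash codes before calling equals, so a class that overrides only equals still dedupes by identity hash and equal objects can both land in the set. A minimal standalone demonstration (class names here are illustrative, not Hive's actual PathAndPartValSize):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Overriding equals without hashCode breaks HashSet deduplication:
// two equal BrokenKey instances almost always have different identity
// hash codes, so the set never even calls equals on them.
public class EqualsWithoutHashCode {
    static final class BrokenKey {
        final String path;
        BrokenKey(String path) { this.path = path; }
        @Override public boolean equals(Object o) {
            return o instanceof BrokenKey && path.equals(((BrokenKey) o).path);
        }
        // hashCode deliberately NOT overridden.
    }

    static final class FixedKey {
        final String path;
        FixedKey(String path) { this.path = path; }
        @Override public boolean equals(Object o) {
            return o instanceof FixedKey && path.equals(((FixedKey) o).path);
        }
        @Override public int hashCode() { return path.hashCode(); }
    }

    public static void main(String[] args) {
        Set<BrokenKey> broken = new HashSet<>(
            Arrays.asList(new BrokenKey("a/b"), new BrokenKey("a/b")));
        Set<FixedKey> fixed = new HashSet<>(
            Arrays.asList(new FixedKey("a/b"), new FixedKey("a/b")));
        System.out.println(broken.size()); // usually 2: the duplicate slipped in
        System.out.println(fixed.size());  // 1
    }
}
```

In the patch's context, a "broken" key would let the same parent path pass the `processed.contains(p)` check twice, reintroducing exactly the duplicate ListFiles calls the fix is meant to remove.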
[jira] [Work logged] (HIVE-25277) Slow Hive partition deletion for Cloud object stores with expensive ListFiles
[ https://issues.apache.org/jira/browse/HIVE-25277?focusedWorklogId=646430&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-646430 ]

ASF GitHub Bot logged work on HIVE-25277:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 03/Sep/21 18:20
            Start Date: 03/Sep/21 18:20
    Worklog Time Spent: 10m

Work Description: coufon commented on a change in pull request #2421:
URL: https://github.com/apache/hive/pull/2421#discussion_r702088341

File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java

+    /** The highest {@code partValSize} is processed first in a {@link PriorityQueue}. */
+    @Override
+    public int compareTo(PathAndPartValSize o) {
+      return ((PathAndPartValSize) o).partValSize - partValSize;

Review comment:
   Done.

Issue Time Tracking
-------------------
    Worklog Id: (was: 646430)
    Time Spent: 3h 10m  (was: 3h)
[jira] [Work logged] (HIVE-25277) Slow Hive partition deletion for Cloud object stores with expensive ListFiles
[ https://issues.apache.org/jira/browse/HIVE-25277?focusedWorklogId=646425&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-646425 ]

ASF GitHub Bot logged work on HIVE-25277:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 03/Sep/21 18:09
            Start Date: 03/Sep/21 18:09
    Worklog Time Spent: 10m

Work Description: coufon commented on a change in pull request #2421:
URL: https://github.com/apache/hive/pull/2421#discussion_r702082867

File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java

+    @Override
+    public boolean equals(Object o) {

Review comment:
   Done.

Issue Time Tracking
-------------------
    Worklog Id: (was: 646425)
    Time Spent: 2h 50m  (was: 2h 40m)
[jira] [Work logged] (HIVE-25277) Slow Hive partition deletion for Cloud object stores with expensive ListFiles
[ https://issues.apache.org/jira/browse/HIVE-25277?focusedWorklogId=646421&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-646421 ]

ASF GitHub Bot logged work on HIVE-25277:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 03/Sep/21 17:56
            Start Date: 03/Sep/21 17:56
    Worklog Time Spent: 10m

Work Description: coufon commented on a change in pull request #2421:
URL: https://github.com/apache/hive/pull/2421#discussion_r702075927

File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java

+    public Path path;

Review comment:
   Done

Issue Time Tracking
-------------------
    Worklog Id: (was: 646421)
    Time Spent: 2h 40m  (was: 2.5h)
[jira] [Work logged] (HIVE-25277) Slow Hive partition deletion for Cloud object stores with expensive ListFiles
[ https://issues.apache.org/jira/browse/HIVE-25277?focusedWorklogId=646420&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-646420 ]

ASF GitHub Bot logged work on HIVE-25277:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 03/Sep/21 17:55
            Start Date: 03/Sep/21 17:55
    Worklog Time Spent: 10m

Work Description: coufon commented on a change in pull request #2421:
URL: https://github.com/apache/hive/pull/2421#discussion_r702075034

File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java

+  /** Stores a path and its size. */
+  private static class PathAndPartValSize implements Comparable<PathAndPartValSize> {

Review comment:
   Done.

Issue Time Tracking
-------------------
    Worklog Id: (was: 646420)
    Time Spent: 2.5h  (was: 2h 20m)
[jira] [Work logged] (HIVE-25277) Slow Hive partition deletion for Cloud object stores with expensive ListFiles
[ https://issues.apache.org/jira/browse/HIVE-25277?focusedWorklogId=646395&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-646395 ]

ASF GitHub Bot logged work on HIVE-25277:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 03/Sep/21 17:22
            Start Date: 03/Sep/21 17:22
    Worklog Time Spent: 10m

Work Description: sunchao commented on a change in pull request #2421:
URL: https://github.com/apache/hive/pull/2421#discussion_r702046193

File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java

+    /** The highest {@code partValSize} is processed first in a {@link PriorityQueue}. */
+    @Override
+    public int compareTo(PathAndPartValSize o) {
+      return ((PathAndPartValSize) o).partValSize - partValSize;

Review comment:
   nit: the cast `(PathAndPartValSize) o` is unnecessary

File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java

+    @Override
+    public boolean equals(Object o) {

Review comment:
   we should also implement `hashCode` together with `equals`?

File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java

+            Path path = p.path;
+            if (wh.isWritable(path) && wh.isDir(path) && wh.isEmptyDir(path)) {

Review comment:
   `wh.isDir(path) && wh.isEmptyDir(path)` seems duplicated? why do we need
   `wh.isDir` if we already have `wh.isEmptyDir`?
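On the compareTo nit above: besides the redundant cast, the subtraction idiom `o.partValSize - partValSize` silently overflows for extreme operands. That case is not reachable for realistic partValSize values, but Integer.compare avoids the trap entirely. A standalone sketch (CompareDemo is an illustrative stand-in, not Hive's class):

```java
// Subtraction-based comparators overflow; Integer.compare does not.
// Both comparators below are meant to produce DESCENDING order
// (larger value first), matching the PriorityQueue use in the patch.
public class CompareDemo {
    static int bySubtraction(int a, int b) {
        return b - a; // overflow-prone
    }

    static int byIntegerCompare(int a, int b) {
        return Integer.compare(b, a); // safe for all int values
    }

    public static void main(String[] args) {
        // 1 - Integer.MIN_VALUE overflows to a NEGATIVE int, inverting the order.
        System.out.println(bySubtraction(Integer.MIN_VALUE, 1));    // -2147483647: wrong sign
        System.out.println(byIntegerCompare(Integer.MIN_VALUE, 1)); // 1: correct
    }
}
```

Integer.compare also makes the intent (descending by partValSize) explicit, which the arithmetic trick obscures.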
[jira] [Updated] (HIVE-25495) Upgrade to JLine3
[ https://issues.apache.org/jira/browse/HIVE-25495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HIVE-25495:
----------------------------------
    Labels: pull-request-available  (was: )

> Upgrade to JLine3
> -----------------
>
>                 Key: HIVE-25495
>                 URL: https://issues.apache.org/jira/browse/HIVE-25495
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: David Mollitor
>            Assignee: David Mollitor
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> JLine 2 was discontinued a long while ago. Hadoop uses JLine 3, so Hive
> should match.
[jira] [Work logged] (HIVE-25495) Upgrade to JLine3
[ https://issues.apache.org/jira/browse/HIVE-25495?focusedWorklogId=646317&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-646317 ]

ASF GitHub Bot logged work on HIVE-25495:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 03/Sep/21 14:05
            Start Date: 03/Sep/21 14:05
    Worklog Time Spent: 10m

Work Description: belugabehr opened a new pull request #2617:
URL: https://github.com/apache/hive/pull/2617

   ### What changes were proposed in this pull request?

   ### Why are the changes needed?

   ### Does this PR introduce _any_ user-facing change?

   ### How was this patch tested?

Issue Time Tracking
-------------------
    Worklog Id: (was: 646317)
    Remaining Estimate: 0h
    Time Spent: 10m
[jira] [Updated] (HIVE-25498) Query with more than 32 count distinct functions returns wrong result
[ https://issues.apache.org/jira/browse/HIVE-25498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HIVE-25498:
----------------------------------
    Labels: pull-request-available  (was: )

> Query with more than 32 count distinct functions returns wrong result
> ---------------------------------------------------------------------
>
>                 Key: HIVE-25498
>                 URL: https://issues.apache.org/jira/browse/HIVE-25498
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Robbie Zhang
>            Assignee: Robbie Zhang
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> If there are more than 32 "COUNT(DISTINCT COL)" functions in a query, all
> these COUNT functions in this query return 0 instead of the proper values.
> Here are the queries to reproduce this issue:
> {code:java}
> set hive.cbo.enable=true;
> create table test_count (c0 string, c1 string, c2 string, c3 string,
>   c4 string, c5 string, c6 string, c7 string, c8 string, c9 string,
>   c10 string, c11 string, c12 string, c13 string, c14 string, c15 string,
>   c16 string, c17 string, c18 string, c19 string, c20 string, c21 string,
>   c22 string, c23 string, c24 string, c25 string, c26 string, c27 string,
>   c28 string, c29 string, c30 string, c31 string, c32 string);
> INSERT INTO test_count values ('c0', 'c1', 'c2', 'c3', 'c4', 'c5', 'c6',
>   'c7', 'c8', 'c9', 'c10', 'c11', 'c12', 'c13', 'c14', 'c15', 'c16', 'c17',
>   'c18', 'c19', 'c20', 'c21', 'c22', 'c23', 'c24', 'c25', 'c26', 'c27',
>   'c28', 'c29', 'c30', 'c31', 'c32');
> select count(distinct c0), count(distinct c1), count(distinct c2),
>   count(distinct c3), count(distinct c4), count(distinct c5),
>   count(distinct c6), count(distinct c7), count(distinct c8),
>   count(distinct c9), count(distinct c10), count(distinct c11),
>   count(distinct c12), count(distinct c13), count(distinct c14),
>   count(distinct c15), count(distinct c16), count(distinct c17),
>   count(distinct c18), count(distinct c19), count(distinct c20),
>   count(distinct c21), count(distinct c22), count(distinct c23),
>   count(distinct c24), count(distinct c25), count(distinct c26),
>   count(distinct c27), count(distinct c28), count(distinct c29),
>   count(distinct c30), count(distinct c31), count(distinct c32)
> from test_count;
> {code}
> This bug is caused by HiveExpandDistinctAggregatesRule.getGroupingIdValue(),
> which uses the int type. When there are more than 32 groupings the values
> overflow.
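The overflow described above can be seen in isolation: Java masks an int shift count to 5 bits, so a grouping id held in a 32-bit int wraps at the 33rd grouping column, while widening to long holds up to 64 columns. (Whether getGroupingIdValue computes exactly this bit pattern is not shown in the report; this sketch only isolates the same int-overflow failure mode.)

```java
// Java masks shift counts: for int, (1 << n) uses n & 31, so 1 << 32 == 1.
// A grouping id tracked as an int therefore collides for column 32 and
// beyond; tracking it as a long is correct up to 64 grouping columns.
public class GroupingIdOverflow {
    static int intGroupingBit(int col) {
        return 1 << col;   // wraps: col 32 aliases col 0
    }

    static long longGroupingBit(int col) {
        return 1L << col;  // distinct bits for col 0..63
    }

    public static void main(String[] args) {
        System.out.println(intGroupingBit(32));  // 1, not 2^32: shift count masked to 0
        System.out.println(longGroupingBit(32)); // 4294967296
    }
}
```

With the int version, the bit for the 33rd distinct column aliases the bit for the first, so every grouping test goes wrong at once, matching the symptom that all the COUNT results collapse to 0 rather than just one of them.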
[jira] [Work logged] (HIVE-25498) Query with more than 32 count distinct functions returns wrong result
[ https://issues.apache.org/jira/browse/HIVE-25498?focusedWorklogId=646309&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-646309 ] ASF GitHub Bot logged work on HIVE-25498: - Author: ASF GitHub Bot Created on: 03/Sep/21 13:50 Start Date: 03/Sep/21 13:50 Worklog Time Spent: 10m Work Description: ujc714 opened a new pull request #2616: URL: https://github.com/apache/hive/pull/2616 ### What changes were proposed in this pull request? Fix a bug in HiveExpandDistinctAggregatesRule.getGroupingIdValue() which causes "COUNT(DISTINCT COL)" function returns wrong result. ### Why are the changes needed? If there are more than 32 COUNT(DISTINCT COL)" function in a query, the values returned from HiveExpandDistinctAggregatesRule.getGroupingIdValue() overflow so these COUNT functions return 0. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? mvn test -Dtest=TestMiniTezCliDriver -Dqfile=multi_count_distinct.q -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 646309) Remaining Estimate: 0h Time Spent: 10m > Query with more than 32 count distinct functions returns wrong result > - > > Key: HIVE-25498 > URL: https://issues.apache.org/jira/browse/HIVE-25498 > Project: Hive > Issue Type: Bug >Reporter: Robbie Zhang >Assignee: Robbie Zhang >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > If there are more than 32 "COUNT(DISTINCT COL)" functions in a query, all > these COUNT functions in this query return 0 instead of the proper values. 
> Here are the queries to reproduce this issue: > {code:java} > set hive.cbo.enable=true; > create table test_count (c0 string, c1 string, c2 string, c3 string, c4 > string, c5 string, c6 string, c7 string, c8 string, c9 string, c10 string, > c11 string, c12 string, c13 string, c14 string, c15 string, c16 string, c17 > string, c18 string, c19 string, c20 string, c21 string, c22 string, c23 > string, c24 string, c25 string, c26 string, c27 string, c28 string, c29 > string, c30 string, c31 string, c32 string); > INSERT INTO test_count values ('c0', 'c1', 'c2', 'c3', 'c4', 'c5', 'c6', > 'c7', 'c8', 'c9', 'c10', 'c11', 'c12', 'c13', 'c14', 'c15', 'c16', 'c17', > 'c18', 'c19', 'c20', 'c21', 'c22', 'c23', 'c24', 'c25', 'c26', 'c27', 'c28', > 'c29', 'c30', 'c31', 'c32'); > select count (distinct c0), count(distinct c1), count(distinct c2), > count(distinct c3), count(distinct c4), count(distinct c5), count(distinct > c6), count(distinct c7), count(distinct c8), count(distinct c9), > count(distinct c10), count(distinct c11), count(distinct c12), count(distinct > c13), count(distinct c14), count(distinct c15), count(distinct c16), > count(distinct c17), count(distinct c18), count(distinct c19), count(distinct > c20), count(distinct c21), count(distinct c22), count(distinct c23), > count(distinct c24), count(distinct c25), count(distinct c26), count(distinct > c27), count(distinct c28), count(distinct c29), count(distinct c30), > count(distinct c31), count(distinct c32) from test_count; > {code} > This bug is caused by HiveExpandDistinctAggregatesRule.getGroupingIdValue() > which uses int type. When there are more than 32 groupings the values > overflow. -- This message was sent by Atlassian Jira (v8.3.4#803005)
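The overflow described in the root cause above is standard Java `int` shift behavior: a 32-bit shift distance is taken modulo 32 (JLS §15.19), so the 33rd grouping's bitmask wraps around and collides with the first. The sketch below illustrates the failure mode and the kind of `int`-to-`long` widening the fix needs; the method names are illustrative, not Hive's actual code.

```java
// Sketch of the grouping-id overflow (assumption: getGroupingIdValue()
// effectively builds a per-grouping bitmask, roughly 1 << position).
public class GroupingIdOverflow {

    // int-based mask, as in the buggy rule: the shift distance is masked
    // to 5 bits, so position 32 silently wraps back to position 0.
    static int intMask(int position) {
        return 1 << position;
    }

    // long-based mask, the widening the fix applies: positions up to 63
    // stay distinct.
    static long longMask(int position) {
        return 1L << position;
    }

    public static void main(String[] args) {
        System.out.println(intMask(31));   // -2147483648 (sign bit, still distinct)
        System.out.println(intMask(32));   // 1 -- collides with position 0
        System.out.println(longMask(32));  // 4294967296 -- distinct
    }
}
```

With the `int` version, grouping 32 is indistinguishable from grouping 0, which matches the symptom of every COUNT(DISTINCT) returning 0 once the 33rd grouping is introduced.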
[jira] [Assigned] (HIVE-25498) Query with more than 32 count distinct functions returns wrong result
[ https://issues.apache.org/jira/browse/HIVE-25498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robbie Zhang reassigned HIVE-25498: --- Assignee: Robbie Zhang > Query with more than 32 count distinct functions returns wrong result > - > > Key: HIVE-25498 > URL: https://issues.apache.org/jira/browse/HIVE-25498 > Project: Hive > Issue Type: Bug >Reporter: Robbie Zhang >Assignee: Robbie Zhang >Priority: Major > > If there are more than 32 "COUNT(DISTINCT COL)" functions in a query, all > these COUNT functions in this query return 0 instead of the proper values. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-25496) hadoop 3.3.1 / hive 3.2.1 / OpenJDK11 compatible?
[ https://issues.apache.org/jira/browse/HIVE-25496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jerome Le Ray reassigned HIVE-25496: Assignee: (was: Jerome Le Ray) > hadoop 3.3.1 / hive 3.2.1 / OpenJDK11 compatible? > - > > Key: HIVE-25496 > URL: https://issues.apache.org/jira/browse/HIVE-25496 > Project: Hive > Issue Type: Bug > Environment: Linux VM >Reporter: Jerome Le Ray >Priority: Major > > We use the following configuration: > hadoop 3.2.1 > hive 3.1.2 > Postgres 12 > Java - OracleJDK 8 > For internal reasons, we have to migrate to OpenJDK11. > So I've migrated from hadoop 3.2.1 to hadoop 3.3.1. > When I start the hiveserver2 service, I get the following error: > which: no hbase in > (/usr/local/bin:/bin:/usr/pgsql-12/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/opt/jdk-11.0.10+9/bin:/opt/hivemetastore/hadoop-3.3.1/bin:/opt/hivemetastore/apache-hive-3.1.2-bin/bin) > 2021-09-02 16:48:05: Starting HiveServer2 > SLF4J: Class path contains multiple SLF4J bindings. > SLF4J: Found binding in > [jar:file:/opt/hivemetastore/hadoop-3.3.1/share/hadoop/common/lib/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: Found binding in > [jar:file:/opt/hivemetastore/apache-hive-3.1.2-bin/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an > explanation. 
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory] > 2021-09-02 16:48:06,744 INFO conf.HiveConf: Found configuration file > file:/opt/hivemetastore/apache-hive-3.1.2-bin/conf/hive-site.xml > 2021-09-02 16:48:07,169 WARN conf.HiveConf: HiveConf of name > hive.metastore.local does not exist > 2021-09-02 16:48:07,169 WARN conf.HiveConf: HiveConf of name > hive.metastore.thrift.bind.host does not exist > 2021-09-02 16:48:07,170 WARN conf.HiveConf: HiveConf of name > hive.enforce.bucketing does not exist > 2021-09-02 16:48:08,414 INFO server.HiveServer2: STARTUP_MSG: > / > STARTUP_MSG: Starting HiveServer2 > STARTUP_MSG: host = lhroelcspt1001.enterprisenet.org/10.90.122.159 > STARTUP_MSG: args = [-hiveconf, mapred.job.tracker=local, -hiveconf, > fs.default.name=file:///cip-data, -hiveconf, > hive.metastore.warehouse.dir=file:cip-data, --hiveconf, hive.server2.thrif > t.port=1, --hiveconf, hive.root.logger=INFO,console] > STARTUP_MSG: version = 3.1.2 > (...) > STARTUP_MSG: build = git://HW13934/Users/gates/tmp/hive-branch-3.1/hive -r > 8190d2be7b7165effa62bd21b7d60ef81fb0e4af; compiled by 'gates' on Thu Aug 22 > 15:01:18 PDT 2019 > / > 2021-09-02 16:48:08,436 INFO server.HiveServer2: Starting HiveServer2 > 2021-09-02 16:48:08,462 WARN conf.HiveConf: HiveConf of name > hive.metastore.local does not exist > 2021-09-02 16:48:08,463 WARN conf.HiveConf: HiveConf of name > hive.metastore.thrift.bind.host does not exist > 2021-09-02 16:48:08,463 WARN conf.HiveConf: HiveConf of name > hive.enforce.bucketing does not exist > Hive Session ID = 440449ff-99b7-429c-82d9-e20bdcc9b46f > 2021-09-02 16:48:08,566 INFO SessionState: Hive Session ID = > 440449ff-99b7-429c-82d9-e20bdcc9b46f > 2021-09-02 16:48:08,566 INFO server.HiveServer2: Shutting down HiveServer2 > 2021-09-02 16:48:08,584 INFO server.HiveServer2: Stopping/Disconnecting tez > sessions. 
> 2021-09-02 16:48:08,585 WARN server.HiveServer2: Error starting HiveServer2 > on attempt 1, will retry in 6ms > java.lang.RuntimeException: Error applying authorization policy on hive > configuration: class jdk.internal.loader.ClassLoaders$AppClassLoader cannot > be cast to class java.net.URLClassLoader (jdk. > internal.loader.ClassLoaders$AppClassLoader and java.net.URLClassLoader are > in module java.base of loader 'bootstrap') > at org.apache.hive.service.cli.CLIService.init(CLIService.java:118) > at org.apache.hive.service.CompositeService.init(CompositeService.java:59) > at org.apache.hive.service.server.HiveServer2.init(HiveServer2.java:230) > at > org.apache.hive.service.server.HiveServer2.startHiveServer2(HiveServer2.java:1036) > at > org.apache.hive.service.server.HiveServer2.access$1600(HiveServer2.java:140) > at > org.apache.hive.service.server.HiveServer2$StartOptionExecutor.execute(HiveServer2.java:1305) > at org.apache.hive.service.server.HiveServer2.main(HiveServer2.java:1149) > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Met
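The ClassCastException in the startup log above is a known JDK 9+ behavior change rather than a configuration problem: since JDK 9 the application (system) class loader is no longer a java.net.URLClassLoader, so JDK 8-era code that casts it fails at runtime. A minimal sketch of the failing cast (an assumption about what Hive 3.1.2 does while "applying authorization policy", not its actual source):

```java
import java.net.URLClassLoader;

// Minimal reproduction of the failing cast: on JDK 8 the system class
// loader is a URLClassLoader, so the cast succeeds; on JDK 9+ it is
// jdk.internal.loader.ClassLoaders$AppClassLoader, which is not a
// URLClassLoader, so the cast throws the ClassCastException seen in
// the HiveServer2 startup log.
public class ClassLoaderCast {
    public static void main(String[] args) {
        ClassLoader cl = ClassLoader.getSystemClassLoader();
        System.out.println("system class loader: " + cl.getClass().getName());
        try {
            URLClassLoader ucl = (URLClassLoader) cl; // fails on JDK 9+
            System.out.println("cast succeeded (JDK 8-style class loader)");
        } catch (ClassCastException e) {
            System.out.println("cast failed: " + e.getMessage());
        }
    }
}
```

Practically, this means Hive 3.1.2 appears to predate JDK 11 support; running it on OpenJDK 8, or moving to a Hive release that supports JDK 11, avoids the cast.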
[jira] [Assigned] (HIVE-25496) hadoop 3.3.1 / hive 3.2.1 / OpenJDK11 compatible?
[ https://issues.apache.org/jira/browse/HIVE-25496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jerome Le Ray reassigned HIVE-25496: Assignee: Jerome Le Ray > hadoop 3.3.1 / hive 3.2.1 / OpenJDK11 compatible? > - > > Key: HIVE-25496 > URL: https://issues.apache.org/jira/browse/HIVE-25496 > Project: Hive > Issue Type: Bug > Environment: Linux VM >Reporter: Jerome Le Ray >Assignee: Jerome Le Ray >Priority: Major >
[jira] [Commented] (HIVE-25496) hadoop 3.3.1 / hive 3.2.1 / OpenJDK11 compatible?
[ https://issues.apache.org/jira/browse/HIVE-25496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17409365#comment-17409365 ] Jerome Le Ray commented on HIVE-25496: -- Hello, below are the versions used: [hive@lhroelcspt1001 hiveserver2log]$ java --version openjdk 11.0.10 2021-01-19 OpenJDK Runtime Environment AdoptOpenJDK (build 11.0.10+9) OpenJDK 64-Bit Server VM AdoptOpenJDK (build 11.0.10+9, mixed mode) [hive@lhroelcspt1001 hiveserver2log]$ [hive@lhroelcspt1001 hiveserver2log]$ hadoop version Hadoop 3.3.1 Source code repository https://github.com/apache/hadoop.git -r a3b9c37a397ad4188041dd80621bdeefc46885f2 Compiled by ubuntu on 2021-06-15T05:13Z Compiled with protoc 3.7.1 From source with checksum 88a4ddb2299aca054416d6b7f81ca55 This command was run using /opt/hivemetastore/hadoop-3.3.1/share/hadoop/common/hadoop-common-3.3.1.jar [hive@lhroelcspt1001 hiveserver2log]$ [hive@lhroelcspt1001 hiveserver2log]$ hive --version SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/opt/hivemetastore/hadoop-3.3.1/share/hadoop/common/lib/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/opt/hivemetastore/apache-hive-3.1.2-bin/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory] Hive 3.1.2 Git git://HW13934/Users/gates/tmp/hive-branch-3.1/hive -r 8190d2be7b7165effa62bd21b7d60ef81fb0e4af Compiled by gates on Thu Aug 22 15:01:18 PDT 2019 From source with checksum 0492c08f784b188c349f6afb1d8d9847 [hive@lhroelcspt1001 hiveserver2log]$ > hadoop 3.3.1 / hive 3.2.1 / OpenJDK11 compatible? 
> - > > Key: HIVE-25496 > URL: https://issues.apache.org/jira/browse/HIVE-25496 > Project: Hive > Issue Type: Bug > Environment: Linux VM >Reporter: Jerome Le Ray >Priority: Major >