zhangbutao commented on code in PR #5447:
URL: https://github.com/apache/hive/pull/5447#discussion_r1797634055
##########
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java:
##########
@@ -3221,11 +3221,11 @@ private List<Path> dropPartitionsAndGetLocations(RawStore ms, String catName, St
for (String partName : partNames) {
String pathString = partitionLocations.get(partName);
if (pathString != null) {
- Path partPath = wh.getDnsPath(new Path(pathString));
// Double check here. Maybe Warehouse.getDnsPath revealed relationship between the
// path objects
if (tableDnsPath == null ||
- !FileUtils.isSubdirectory(tableDnsPath, partPath.toString())) {
+ !FileUtils.isSubdirectory(tableDnsPath, pathString)) {
Review Comment:
After checking, I think your change has no effect on the current
branches (Hive4 and master).
That is because HIVE-19783 in Hive4 changed the logic for fetching
partition locations. See
https://github.com/apache/hive/commit/e36f6e4fbda354f33ba9cef6cf25e5573c78d618#diff-cf3f64ef2e2a3a8ee2867032c2d3a1ac0b567a60f82b00299cc0dc99e00888a8
HIVE-19783 optimized the code to retrieve only the partition locations,
instead of the full partition objects, via `ObjectStore::getPartitionLocations`.
And this method **returns a location string only for partitions located outside
the table location**, see comment
https://issues.apache.org/jira/browse/HIVE-19783?focusedCommentId=16512315&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16512315.
As a result, after HIVE-19783, if the partition location is a subdirectory of
the table path, the variable `pathString` will always be null. This code
snippet only takes effect if the partition location is outside the table location.
I guess your partition location is a subdirectory of the table path, e.g. table
location = hdfs://ns1/db/testtbl, and partition location =
hdfs://ns1/db/testtbl/dt=2024. If so, after the optimization of HIVE-19783,
`if (pathString != null)` will be false, and you won't see any performance issue.
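To make the reasoning concrete, here is a minimal Java sketch of the behavior described above. It is not the actual Hive code: `locationOrNull` is a hypothetical stand-in for what `ObjectStore::getPartitionLocations` is described as returning, the string prefix check is a simplification of the real path comparison, and the second partition location (`hdfs://ns1/other/dt=2023`) is made up for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PartitionLocationSketch {

    // Hypothetical stand-in (assumption): returns null when the partition lives
    // under the table path, and the location string otherwise.
    // A plain prefix check is used here as a simplification.
    static String locationOrNull(String tableLocation, String partLocation) {
        String base = tableLocation.endsWith("/") ? tableLocation : tableLocation + "/";
        return partLocation.startsWith(base) ? null : partLocation;
    }

    // Mirrors the shape of the loop in the diff: only non-null locations
    // ever reach the subdirectory check (and any file system calls around it).
    static List<String> locationsNeedingCheck(String tableLocation,
                                              Map<String, String> partitions) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, String> e : partitions.entrySet()) {
            String pathString = locationOrNull(tableLocation, e.getValue());
            if (pathString != null) {
                result.add(pathString);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String table = "hdfs://ns1/db/testtbl";
        Map<String, String> parts = new LinkedHashMap<>();
        parts.put("dt=2024", "hdfs://ns1/db/testtbl/dt=2024"); // inside the table path
        parts.put("dt=2023", "hdfs://ns1/other/dt=2023");      // hypothetical external location
        // Only the external partition is left for the extra check.
        System.out.println(locationsNeedingCheck(table, parts));
    }
}
```

Under these assumptions, the partition under the table path is skipped entirely, so the changed line is only reached for partitions stored outside the table location.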
**But I guess you are using Hive3? Your change may be useful for Hive3.**
BTW, it would be good if you tried the optimization from HIVE-24838 and set the
metastore property as follows:
`hive.blobstore.supported.schemes=hdfs,s3,s3a,s3n`. I think it can also
save some fs calls in many places.