zhangbutao commented on code in PR #5447: URL: https://github.com/apache/hive/pull/5447#discussion_r1797634055
##########
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java:
##########
@@ -3221,11 +3221,11 @@ private List<Path> dropPartitionsAndGetLocations(RawStore ms, String catName, St
       for (String partName : partNames) {
         String pathString = partitionLocations.get(partName);
         if (pathString != null) {
-          Path partPath = wh.getDnsPath(new Path(pathString));
           // Double check here. Maybe Warehouse.getDnsPath revealed relationship between the
           // path objects
           if (tableDnsPath == null ||
-              !FileUtils.isSubdirectory(tableDnsPath, partPath.toString())) {
+              !FileUtils.isSubdirectory(tableDnsPath, pathString)) {

Review Comment:
   After checking, I think your change has no effect on the current branches (Hive4 and master), because HIVE-19783 in Hive4 changed the logic for fetching partition locations. See https://github.com/apache/hive/commit/e36f6e4fbda354f33ba9cef6cf25e5573c78d618#diff-cf3f64ef2e2a3a8ee2867032c2d3a1ac0b567a60f82b00299cc0dc99e00888a8 .

   HIVE-19783 optimized this path to retrieve only the partition locations, rather than the full partition objects, via `ObjectStore::getPartitionLocations`. That method **returns a location string only when the partition lies outside the table location**; see the comment https://issues.apache.org/jira/browse/HIVE-19783?focusedCommentId=16512315&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16512315.

   As a result, after HIVE-19783, if the partition location is a subdirectory of the table path, the variable `pathString` is always null; this code snippet only takes effect when the partition location is outside the table location.

   I guess your partition location is a subdirectory of the table path, e.g. table location = hdfs://ns1/db/testtbl and partition location = hdfs://ns1/db/testtbl/dt=2024. If so, after the optimization in HIVE-19783, `if (pathString != null)` is false, so you won't see any performance issue. **But I guess you are using Hive3?
   Your change may be useful for Hive3.**

   BTW, it would be good to try the optimization from HIVE-24838 by setting the metastore property as follows: `hive.blobstore.supported.schemes=hdfs,s3,s3a,s3n`. I think it can also save some fs calls in many places.
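
   To illustrate the point about `pathString` being null, here is a minimal, self-contained sketch. It does not use the Hive classes: `locationIfOutsideTable` is a hypothetical stand-in for the behavior described for `ObjectStore::getPartitionLocations`, implemented with a plain string-prefix check, and the `hdfs://ns2/archive/...` location is an invented example of a partition outside the table path.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class PartitionLocationSketch {

  // Hypothetical stand-in (not the Hive implementation): returns null when the
  // partition lives under the table directory, otherwise the partition location.
  static String locationIfOutsideTable(String tableLocation, String partitionLocation) {
    String tablePrefix = tableLocation.endsWith("/") ? tableLocation : tableLocation + "/";
    if (partitionLocation.equals(tableLocation) || partitionLocation.startsWith(tablePrefix)) {
      return null;
    }
    return partitionLocation;
  }

  // Mirrors the shape of the loop in the diff: only non-null entries reach the
  // extra subdirectory check, so partitions under the table path are skipped.
  static List<String> pathsNeedingSeparateCheck(String tableLocation,
                                                Map<String, String> partitions) {
    List<String> result = new ArrayList<>();
    for (Map.Entry<String, String> entry : partitions.entrySet()) {
      String pathString = locationIfOutsideTable(tableLocation, entry.getValue());
      if (pathString != null) {
        result.add(pathString);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    Map<String, String> partitions = new LinkedHashMap<>();
    // Partition under the table path (the example from the comment above).
    partitions.put("dt=2024", "hdfs://ns1/db/testtbl/dt=2024");
    // Invented example of a partition stored outside the table path.
    partitions.put("dt=2023", "hdfs://ns2/archive/testtbl/dt=2023");

    // prints [hdfs://ns2/archive/testtbl/dt=2023]
    System.out.println(pathsNeedingSeparateCheck("hdfs://ns1/db/testtbl", partitions));
  }
}
```

   Under these assumptions, only the partition stored outside the table path reaches the `if (pathString != null)` branch, which is why the change in this PR would make no difference when every partition sits under the table directory.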