Optimize CombineHiveFileInputFormat execution speed
---------------------------------------------------
                 Key: HIVE-1149
                 URL: https://issues.apache.org/jira/browse/HIVE-1149
             Project: Hadoop Hive
          Issue Type: Bug
            Reporter: Zheng Shao

When there are a lot of files and a lot of pools, CombineHiveFileInputFormat is pretty slow.
One of the culprits is the "new URI" call inside the loop of the following function.
We should try to get rid of it.

{code}
  protected static PartitionDesc getPartitionDescFromPath(
      Map<String, PartitionDesc> pathToPartitionInfo, Path dir) throws IOException {
    // The format of the keys in pathToPartitionInfo sometimes contains a port
    // and sometimes doesn't, so we just compare paths.
    for (Map.Entry<String, PartitionDesc> entry : pathToPartitionInfo
        .entrySet()) {
      try {
        if (new URI(entry.getKey()).getPath().equals(dir.toUri().getPath())) {
          return entry.getValue();
        }
      } catch (URISyntaxException e2) {
      }
    }
    throw new IOException("cannot find dir = " + dir.toString()
        + " in partToPartitionInfo!");
  }
{code}
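
One possible direction, sketched here under assumptions rather than as the actual fix: parse every key once, index the values by the path component, and answer each lookup from that index, so no "new URI" call runs per lookup. The class name PathOnlyIndex is hypothetical and not part of Hive. To compile standalone, the sketch uses java.net.URI in place of Hadoop's Path and a generic value type in place of PartitionDesc.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical helper (not part of Hive): indexes the values of a
 * "location string -> value" map by the path component of each key.
 */
final class PathOnlyIndex<V> {
    private final Map<String, V> byPath = new HashMap<>();

    /** Parses each key exactly once, when the index is built. */
    PathOnlyIndex(Map<String, V> pathToValue) {
        for (Map.Entry<String, V> entry : pathToValue.entrySet()) {
            try {
                String path = new URI(entry.getKey()).getPath();
                if (path != null) {
                    // Keep the first value seen for a given path.
                    byPath.putIfAbsent(path, entry.getValue());
                }
            } catch (URISyntaxException e) {
                // Skip keys that are not valid URIs, as the original loop does.
            }
        }
    }

    /** Returns the value whose key has the same path as dir, or null if none. */
    V get(URI dir) {
        return byPath.get(dir.getPath());
    }
}
```

With such an index, building costs one URI parse per key and each lookup is a single hash probe, instead of one URI parse per key on every call. The caller would still need to throw the IOException when get returns null. Keys whose URI has no path component are skipped here, which is an assumption about the desired behavior.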