Hello Bharath Vissapragada,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/5743

to look at the new patch set (#2).

Change subject: IMPALA-4789: Fix slow metadata loading due to inconsistent paths.
......................................................................

IMPALA-4789: Fix slow metadata loading due to inconsistent paths.

The fix for IMPALA-4172/IMPALA-3653 introduced a performance
regression for loading tables that have many partitions with:
1. Inconsistent HDFS path qualification or
2. A custom location (not under the table root dir)

For the first issue consider a table whose root path is at
'hdfs://localhost:8020/warehouse/tbl/'.
A partition with an unqualified location '/warehouse/tbl/p=1'
will not be recognized as being a descendant of the table root
dir by FileSystemUtil.isDescendentPath() because of how
Path.equals() behaves, even if 'hdfs://localhost:8020' is the
default filesystem. Such partitions are incorrectly recognized
as having a custom location and are loaded separately.

There were two performance issues:
1. The code for loading the files/blocks of partitions with
   truly custom locations was inefficient with an O(N^2) loop
   for determining the target partition.
2. Each partition that is incorrectly identified as having
   a custom path (e.g. due to inconsistent qualification),
   is going to have its files/blocks loaded twice. Once when
   the table root path is processed, and once when the 'custom'
   partition is processed.

This patch fixes the detection of partitions with custom
locations, and improves the speed of loading partitions with
custom locations.
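
To illustrate the qualification problem described above, here is a
minimal sketch that uses java.net.URI as a stand-in for Hadoop's Path.
It is not the code in this patch: the class name and the helpers
qualify() and isDescendant() are hypothetical, and the real logic in
FileSystemUtil may differ. It only shows why an unqualified partition
location fails a descendant check against a fully qualified table root
until it is qualified with the default filesystem.

```java
import java.net.URI;
import java.util.Objects;

public class PathQualificationSketch {
    // Hypothetical helper: fill in a missing scheme/authority from the
    // default filesystem URI (an assumption, not Impala's actual code).
    public static URI qualify(URI path, URI defaultFs) {
        String scheme = path.getScheme() != null
            ? path.getScheme() : defaultFs.getScheme();
        String authority = path.getAuthority() != null
            ? path.getAuthority() : defaultFs.getAuthority();
        return URI.create(scheme + "://" + authority + path.getPath()).normalize();
    }

    // Hypothetical descendant check: scheme and authority must match
    // exactly, and the child path must lie strictly under the parent path.
    public static boolean isDescendant(URI child, URI parent) {
        if (!Objects.equals(child.getScheme(), parent.getScheme())) return false;
        if (!Objects.equals(child.getAuthority(), parent.getAuthority())) return false;
        String parentPath = parent.getPath().endsWith("/")
            ? parent.getPath() : parent.getPath() + "/";
        return child.getPath().startsWith(parentPath)
            && child.getPath().length() > parentPath.length();
    }

    public static void main(String[] args) {
        URI defaultFs = URI.create("hdfs://localhost:8020");
        URI tableRoot = URI.create("hdfs://localhost:8020/warehouse/tbl/");
        URI partition = URI.create("/warehouse/tbl/p=1");

        // Unqualified location: scheme/authority are null, so the check fails
        // and the partition would be treated as having a custom location.
        System.out.println(isDescendant(partition, tableRoot));

        // After qualification the partition is recognized as a descendant.
        System.out.println(isDescendant(qualify(partition, defaultFs), tableRoot));
    }
}
```

In this sketch, qualifying both paths before comparing them is what
prevents such a partition from being misclassified as 'custom' and then
loaded a second time.
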

Change-Id: I8c881b7cb155032b82fba0e29350ca31de388d55
---
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
2 files changed, 55 insertions(+), 15 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/43/5743/2
--
To view, visit http://gerrit.cloudera.org:8080/5743
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8c881b7cb155032b82fba0e29350ca31de388d55
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bhara...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhe...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dtsirogian...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <mar...@cloudera.com>