Hello Lars Volker, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/12282

to look at the new patch set (#6).

Change subject: IMPALA-6050: Query profiles should indicate storage layer(s) used
......................................................................

IMPALA-6050: Query profiles should indicate storage layer(s) used

This patch updates Impala explain plans so that the Scan Node section
clearly displays which filesystems the Scan Node is reading data from
(support has been added for scans from HDFS, S3, ADLS, and the local
filesystem).

Before this patch, if an Impala query scanned a table with partitions
across different storage layers, the explain plan would look like this:

    PLAN-ROOT SINK
    |
    01:EXCHANGE [UNPARTITIONED]
    |
    00:SCAN HDFS [functional.alltypes]
       partitions=24/24 files=24 size=478.45KB

Now the explain plan will look like this:

    PLAN-ROOT SINK
    |
    01:EXCHANGE [UNPARTITIONED]
    |
    00:SCAN S3 [functional.alltypes]
       ADLS partitions=4/24 files=4 size=478.45KB
       HDFS partitions=10/24 files=10 size=478.45KB
       S3 partitions=10/24 files=10 size=478.45KB

The explain plan differentiates "SCAN HDFS" vs "SCAN S3" by using the
root table path. This means that even scans of non-partitioned tables
will see their explain plans change from "SCAN HDFS" to
"SCAN [storage-layer-name]".

This change also affects explain plans for tables that are stored on a
single storage layer: 'partitions=...' will become
'HDFS partitions=...'.

This patch makes several changes to PlannerTest.java so that, by
default, test files do not validate the value of the storage layer
displayed in the explain plan. This is necessary to support classes
such as S3PlannerTest, which run test files against S3. It also makes
several changes to impala_test_suite.py in order to support validation
of explain plans in test files that run via Python. Specifically, it
adds support for a new substitution variable in test files called
$FILESYSTEM_NAME, which is the name of the storage layer the test is
being run against.
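For reviewers who want a concrete picture of the behavior described above, here is a minimal Python sketch. It is not the code in this patch: the function names, the scheme-to-label table (including the "LOCAL" label), and the example paths are all hypothetical, and the real logic lives in the Java planner classes. It only illustrates picking the "SCAN <layer>" label from the root table path, grouping partitions by storage layer, and expanding $FILESYSTEM_NAME in expected plan text.

```python
from collections import defaultdict
from urllib.parse import urlparse

# Hypothetical mapping from URI scheme to the label shown in the plan.
# The labels actually used by the patch may differ.
SCHEME_TO_LABEL = {
    "hdfs": "HDFS",
    "s3a": "S3",
    "adl": "ADLS",
    "file": "LOCAL",
}


def storage_label(path):
    """Return a storage-layer label derived from the path's URI scheme."""
    scheme = urlparse(path).scheme.lower()
    return SCHEME_TO_LABEL.get(scheme, scheme.upper() or "UNKNOWN")


def scan_node_lines(table_name, root_path, partition_paths):
    """Build simplified Scan Node lines: a header labeled by the root
    table path, then one partition count line per storage layer."""
    counts = defaultdict(int)
    for path in partition_paths:
        counts[storage_label(path)] += 1
    total = len(partition_paths)
    lines = ["00:SCAN %s [%s]" % (storage_label(root_path), table_name)]
    for label in sorted(counts):
        lines.append("   %s partitions=%d/%d" % (label, counts[label], total))
    return lines


def substitute_filesystem_name(expected, filesystem_name):
    """Expand the $FILESYSTEM_NAME variable in expected plan text."""
    return expected.replace("$FILESYSTEM_NAME", filesystem_name)
```

Under these assumptions, a test file could contain a line such as "00:SCAN $FILESYSTEM_NAME [functional.alltypes]", and the same expected output would then validate against whichever storage layer the test run targets.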

Testing:
* Ran core tests
* Added new tests to PlannerTest
* Added ExplainTest to allow for more fine-grained testing of explain
  plan logic

Change-Id: I4b1b4a1bc1a24e9614e3b4dc5a61dc96d075d1c3
---
M fe/pom.xml
M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java
M fe/src/main/java/org/apache/impala/catalog/FeFsPartition.java
M fe/src/main/java/org/apache/impala/catalog/FeFsTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java
M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/test/java/org/apache/impala/common/FrontendFixture.java
A fe/src/test/java/org/apache/impala/planner/ExplainTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
M fe/src/test/java/org/apache/impala/testutil/TestUtils.java
A testdata/workloads/functional-planner/queries/PlannerTest/scan-node-fs-scheme.test
M testdata/workloads/functional-query/queries/QueryTest/partition-col-types.test
M tests/common/impala_test_suite.py
M tests/util/filesystem_utils.py
19 files changed, 624 insertions(+), 88 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/82/12282/6
--
To view, visit http://gerrit.cloudera.org:8080/12282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4b1b4a1bc1a24e9614e3b4dc5a61dc96d075d1c3
Gerrit-Change-Number: 12282
Gerrit-PatchSet: 6
Gerrit-Owner: Sahil Takiar <stak...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Lars Volker <l...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>