Todd Lipcon has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/10711 )
Change subject: IMPALA-7141 (part 1): clean up handling of default/dummy partition ...................................................................... IMPALA-7141 (part 1): clean up handling of default/dummy partition Currently, HdfsTable inconsistently uses the term "default partition" to refer to two different concepts: 1) For unpartitioned tables, a single partition with ID 1 and no partition keys is created and added to the partition map. 2) All tables have an additional partition added with partition ID -1 which acts as a sort of prototype for partition creation: when new partitions are created during an INSERT operation, the file format and other related options are copied out of this special partition. This partition is inconsistently referred to as either the "default partition" or the "dummy partition". The handling of this second case (the partition with id -1) was somewhat messy: - the partition shows up in the partitionMap_ member, but does not show up in the partitionIds_ member. - almost all of the call sites that iterate through the partitions of an HdfsTable instance ended up skipping over the dummy partition. - several call sites called getPartitions().size() but then had to adjust the result by subtracting one in order to actually count the number of partitions in a table. - similarly, test assertions had to assert that tables with 24 partitions had an expected partition map size of 25. In order to address the above, this patch makes the following changes: - getPartitions() and getPartitionMap() no longer include the dummy partition. This removes a bunch of special case checks to skip over the dummy partition or to adjust partition counts based on it. - to clarify the purpose of this partition, references to it are renamed to "prototype partition" instead of "default partition". - when converting the HdfsTable to/from Thrift, the prototype partition is included in its own field in the struct, instead of being stuffed into the same map with the true partitions of the table. This reflects the fact that this partition is special (eg missing fields like 'location' which otherwise are required for real partitions) This change should should be entirely internal with no functional differences. As such, the only testing changes are some fixes for assertions on the Thrift serialized structures and other internals. Change-Id: I15e91b50eb7c2a5e0bac8c33d603d6cd8cbaca2e Reviewed-on: http://gerrit.cloudera.org:8080/10711 Tested-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Reviewed-by: Todd Lipcon <t...@apache.org> --- M be/src/exec/hdfs-table-sink.cc M be/src/exec/hdfs-table-sink.h M be/src/runtime/descriptors.cc M be/src/runtime/descriptors.h M common/thrift/CatalogObjects.thrift M common/thrift/ImpalaInternalService.thrift M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java M fe/src/main/java/org/apache/impala/analysis/LoadDataStmt.java M fe/src/main/java/org/apache/impala/catalog/FeFsPartition.java M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/test/java/org/apache/impala/catalog/CatalogObjectToFromThriftTest.java M fe/src/test/java/org/apache/impala/catalog/CatalogTest.java M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java M fe/src/test/java/org/apache/impala/testutil/BlockIdGenerator.java 18 files changed, 163 insertions(+), 205 deletions(-) Approvals: Impala Public Jenkins: Verified Todd Lipcon: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/10711 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I15e91b50eb7c2a5e0bac8c33d603d6cd8cbaca2e Gerrit-Change-Number: 10711 Gerrit-PatchSet: 6 Gerrit-Owner: Todd Lipcon <t...@apache.org> Gerrit-Reviewer: Gabor Kaszab <gaborkas...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Tianyi Wang <tw...@cloudera.com> Gerrit-Reviewer: Todd Lipcon <t...@apache.org> Gerrit-Reviewer: Vuk Ercegovac <vercego...@cloudera.com> Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>