[Impala-ASF-CR] IMPALA-1670: Support multiple partitions in ALTER TABLE ADD PARTITION
Attila Jeges has posted comments on this change. Change subject: IMPALA-1670: Support multiple partitions in ALTER TABLE ADD PARTITION .. Patch Set 10: (10 comments) http://gerrit.cloudera.org:8080/#/c/4144/10/fe/src/main/java/com/cloudera/impala/analysis/PartitionParams.java File fe/src/main/java/com/cloudera/impala/analysis/PartitionParams.java: PS10, Line 75: params.setLocation > Why are you setting the location if it's null? isn't this an optional field Done PS10, Line 90: table_ instanceof HdfsTable) > Is PartitionParams class applicable to other table types? If not, set a Pre After going through the code once again, I realized that during the analysis, the Table object is available through the 'analyzer' object, therefore PartitionParams.setTable() is not needed at all. The partitionSpec_.analyze() call ensures that the target table is an HdfsTable instance, so it is safe to remove the checks from PartitionParams.analyze(). http://gerrit.cloudera.org:8080/#/c/4144/10/fe/src/main/java/com/cloudera/impala/analysis/PartitionSpec.java File fe/src/main/java/com/cloudera/impala/analysis/PartitionSpec.java: PS10, Line 210: new ArrayList(partitionSpec_); > = Lists.newArrayList(partitionSpec_)? Done http://gerrit.cloudera.org:8080/#/c/4144/10/fe/src/main/java/com/cloudera/impala/service/CatalogOpExecutor.java File fe/src/main/java/com/cloudera/impala/service/CatalogOpExecutor.java: PS10, Line 1640: for partitions not yet in catalog cache, add cache directives > I don't think the comment is accurate. We shouldn't by default add cache di Correct, the comment was fixed. PS10, Line 1644: TAlterTableAddPartitionParams addPartParams) : throws ImpalaException { > nit: can these two be in the same line? Done PS10, Line 1651: hmsPartitions > Maybe rename to "hmsPartitionsToAdd"? Done PS10, Line 1653: new HashMap(); > Maps.newHashMap()? Done PS10, Line 1657: partitionSpec)) { > formatting nit: 2 more spaces are needed since this is part of a statement Done PS10, Line 1673: partitionCachingOpMap.put(hmsPartition.getValues(), cacheOp); > Why do you populate this map with all the new partitions? Not all the parti Correct, if cacheOp is null it doesn't have to be added to the map. I'm not sure I understood the 2nd part of your comment correctly, please clarify it for me if I didn't: If I create a list of Partitions that have a non-null cacheOp and pass that instead of 'hmsPartitions' to alterTableCachePartitions(), then I still need to map partitions to their corresponding cacheOps somehow. Otherwise I couldn't tell what cache pool name/replication to use when caching a partition. Also note that if the table is cached, its partitions should be cached as well when their cacheOp is null, so it is not safe to exclude them. Here's what I did to simplify the code and get rid off partitionCachingOpMap: Create a list of cacheOps for all the new partitions and pass that instead of 'partitionCachingOpMap' to 'alterTableCachePartitions()' along with the list of new partitions. The list of cacheOps and the list of new partitions are of equal size and their elements can be paired up to get the partitions with their corresponding cacheOps. PS10, Line 1689: hmsPartitions = getPartitionsFromHMS(msClient, tableName, hmsPartitions, : addedHmsPartitions); > Do we always need to call this function? What if the size of addedHmsPartit Correct, no need to make the call in that case. -- To view, visit http://gerrit.cloudera.org:8080/4144 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iddbc951f2931f488f7048c9780260f6b49100750 Gerrit-PatchSet: 10 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Attila Jeges
Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-1670: Support multiple partitions in ALTER TABLE ADD PARTITION
Attila Jeges has uploaded a new patch set (#10). Change subject: IMPALA-1670: Support multiple partitions in ALTER TABLE ADD PARTITION .. IMPALA-1670: Support multiple partitions in ALTER TABLE ADD PARTITION Just like Hive, Implala should support multiple partitions in ALTER TABLE ADD PARTITION statements. The syntax is as follows: ALTER TABLE table_name ADD [IF NOT EXISTS] PARTITION partition_spec1 [location_spec1] [cache_spec1] PARTITION partition_spec2 [location_spec2] [cache_spec2] ... Grammar was modified to handle the new syntax. Introduced PartitionParams class to capture the repeatable part of the statement. TPartitionParams is the name of the corresponding thrift class. AlterTableAddPartitionStmt and CatalogOpExecutor classes were also modified to work with a list of partitions. Duplicate partition specs are rejected in AlterTableAddPartitionStmt.analyze(). Added FE, E2E and integration tests. Change-Id: Iddbc951f2931f488f7048c9780260f6b49100750 --- M common/thrift/JniCatalog.thrift M fe/src/main/cup/sql-parser.cup M fe/src/main/java/com/cloudera/impala/analysis/AlterTableAddPartitionStmt.java A fe/src/main/java/com/cloudera/impala/analysis/PartitionParams.java M fe/src/main/java/com/cloudera/impala/analysis/PartitionSpec.java M fe/src/main/java/com/cloudera/impala/service/CatalogOpExecutor.java M fe/src/test/java/com/cloudera/impala/analysis/AnalyzeDDLTest.java M fe/src/test/java/com/cloudera/impala/analysis/AuthorizationTest.java M fe/src/test/java/com/cloudera/impala/analysis/ParserTest.java M fe/src/test/java/com/cloudera/impala/analysis/ToSqlTest.java M testdata/workloads/functional-query/queries/QueryTest/alter-table.test M testdata/workloads/functional-query/queries/QueryTest/grant_revoke.test M tests/common/impala_test_suite.py M tests/metadata/test_hms_integration.py M tests/metadata/test_partition_metadata.py 15 files changed, 716 insertions(+), 204 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/4144/10 -- To view, visit http://gerrit.cloudera.org:8080/4144 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iddbc951f2931f488f7048c9780260f6b49100750 Gerrit-PatchSet: 10 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Attila JegesGerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho
[Impala-ASF-CR] IMPALA-1670: Support multiple partitions in ALTER TABLE ADD PARTITION
Dimitris Tsirogiannis has posted comments on this change. Change subject: IMPALA-1670: Support multiple partitions in ALTER TABLE ADD PARTITION .. Patch Set 8: (1 comment) http://gerrit.cloudera.org:8080/#/c/4144/8/fe/src/main/java/com/cloudera/impala/service/CatalogOpExecutor.java File fe/src/main/java/com/cloudera/impala/service/CatalogOpExecutor.java: PS8, Line 1766: for (Partition partition: hmsPartitions) { : // Create and add the HdfsPartition. Return the table object with an updated : // catalog version. : addHdfsPartition(tbl, partition); : } > Correct, this is exactly what happens. I've added an E2E test (test_add_ove No, I don't think is the correct behavior. Actually I filed a JIRA (IMPALA-4141) about this case because it wasn't even working correctly for the single partition case. The right thing to do is to also add the partition in the catalog cache. If the user tries to add a partition he should get one of the two possible outcomes. Either an error is thrown and the partition is not added or no error and the partition is added. No error and no partition added is weird from a usability point of view. -- To view, visit http://gerrit.cloudera.org:8080/4144 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iddbc951f2931f488f7048c9780260f6b49100750 Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Attila JegesGerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-HasComments: Yes