[Impala-ASF-CR] IMPALA-1670: Support multiple partitions in ALTER TABLE ADD PARTITION

2016-09-22 Thread Attila Jeges (Code Review)
Attila Jeges has posted comments on this change.

Change subject: IMPALA-1670: Support multiple partitions in ALTER TABLE ADD 
PARTITION
..


Patch Set 10:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/4144/10/fe/src/main/java/com/cloudera/impala/analysis/PartitionParams.java
File fe/src/main/java/com/cloudera/impala/analysis/PartitionParams.java:

PS10, Line 75: params.setLocation
> Why are you setting the location if it's null? isn't this an optional field
Done


PS10, Line 90: table_ instanceof HdfsTable)
> Is PartitionParams class applicable to other table types? If not, set a Pre
After going through the code once again, I realized that during the analysis, 
the Table object is available through the 'analyzer' object, therefore 
PartitionParams.setTable() is not needed at all.

The partitionSpec_.analyze() call ensures that the target table is an HdfsTable 
instance, so it is safe to remove the checks from PartitionParams.analyze().


http://gerrit.cloudera.org:8080/#/c/4144/10/fe/src/main/java/com/cloudera/impala/analysis/PartitionSpec.java
File fe/src/main/java/com/cloudera/impala/analysis/PartitionSpec.java:

PS10, Line 210: new ArrayList(partitionSpec_);
> = Lists.newArrayList(partitionSpec_)?
Done


http://gerrit.cloudera.org:8080/#/c/4144/10/fe/src/main/java/com/cloudera/impala/service/CatalogOpExecutor.java
File fe/src/main/java/com/cloudera/impala/service/CatalogOpExecutor.java:

PS10, Line 1640: for partitions not yet in catalog cache, add cache directives
> I don't think the comment is accurate. We shouldn't by default add cache di
Correct, the comment was fixed.


PS10, Line 1644: TAlterTableAddPartitionParams addPartParams)
   :   throws ImpalaException {
> nit: can these two be in the same line?
Done


PS10, Line 1651: hmsPartitions
> Maybe rename to "hmsPartitionsToAdd"?
Done


PS10, Line 1653: new HashMap();
> Maps.newHashMap()?
Done


PS10, Line 1657: partitionSpec)) {
> formatting nit: 2 more spaces are needed since this is part of a statement
Done


PS10, Line 1673: partitionCachingOpMap.put(hmsPartition.getValues(), cacheOp);
> Why do you populate this map with all the new partitions? Not all the parti
Correct, if cacheOp is null it doesn't have to be added to the map.

I'm not sure I understood the 2nd part of your comment correctly, please 
clarify it for me if I didn't:
If I create a list of Partitions that have a non-null cacheOp and pass that 
instead of 'hmsPartitions' to alterTableCachePartitions(), then I still need to 
map partitions to their corresponding cacheOps somehow. Otherwise I couldn't 
tell what cache pool name/replication to use when caching a partition. Also 
note that if the table is cached, its partitions should be cached as well when 
their cacheOp is null, so it is not safe to exclude them.

Here's what I did to simplify the code and get rid off partitionCachingOpMap:
Create a list of cacheOps for all the new partitions and pass that instead of 
'partitionCachingOpMap' to 'alterTableCachePartitions()' along with the list of 
new partitions. The list of cacheOps and the list of new partitions are of 
equal size and their elements can be paired up to get the partitions with their 
corresponding cacheOps.


PS10, Line 1689: hmsPartitions = getPartitionsFromHMS(msClient, tableName, 
hmsPartitions,
   : addedHmsPartitions);
> Do we always need to call this function? What if the size of addedHmsPartit
Correct, no need to make the call in that case.


-- 
To view, visit http://gerrit.cloudera.org:8080/4144
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iddbc951f2931f488f7048c9780260f6b49100750
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-1670: Support multiple partitions in ALTER TABLE ADD PARTITION

2016-09-19 Thread Attila Jeges (Code Review)
Attila Jeges has uploaded a new patch set (#10).

Change subject: IMPALA-1670: Support multiple partitions in ALTER TABLE ADD 
PARTITION
..

IMPALA-1670: Support multiple partitions in ALTER TABLE ADD PARTITION

Just like Hive, Implala should support multiple partitions in ALTER
TABLE ADD PARTITION statements. The syntax is as follows:

ALTER TABLE table_name ADD [IF NOT EXISTS]
PARTITION partition_spec1 [location_spec1] [cache_spec1]
PARTITION partition_spec2 [location_spec2] [cache_spec2]
...

Grammar was modified to handle the new syntax. Introduced
PartitionParams class to capture the repeatable part of the statement.
TPartitionParams is the name of the corresponding thrift class.

AlterTableAddPartitionStmt and CatalogOpExecutor classes were also
modified to work with a list of partitions. Duplicate partition specs
are rejected in AlterTableAddPartitionStmt.analyze().

Added FE, E2E and integration tests.

Change-Id: Iddbc951f2931f488f7048c9780260f6b49100750
---
M common/thrift/JniCatalog.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/com/cloudera/impala/analysis/AlterTableAddPartitionStmt.java
A fe/src/main/java/com/cloudera/impala/analysis/PartitionParams.java
M fe/src/main/java/com/cloudera/impala/analysis/PartitionSpec.java
M fe/src/main/java/com/cloudera/impala/service/CatalogOpExecutor.java
M fe/src/test/java/com/cloudera/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/com/cloudera/impala/analysis/AuthorizationTest.java
M fe/src/test/java/com/cloudera/impala/analysis/ParserTest.java
M fe/src/test/java/com/cloudera/impala/analysis/ToSqlTest.java
M testdata/workloads/functional-query/queries/QueryTest/alter-table.test
M testdata/workloads/functional-query/queries/QueryTest/grant_revoke.test
M tests/common/impala_test_suite.py
M tests/metadata/test_hms_integration.py
M tests/metadata/test_partition_metadata.py
15 files changed, 716 insertions(+), 204 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/4144/10
-- 
To view, visit http://gerrit.cloudera.org:8080/4144
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iddbc951f2931f488f7048c9780260f6b49100750
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 


[Impala-ASF-CR] IMPALA-1670: Support multiple partitions in ALTER TABLE ADD PARTITION

2016-09-15 Thread Dimitris Tsirogiannis (Code Review)
Dimitris Tsirogiannis has posted comments on this change.

Change subject: IMPALA-1670: Support multiple partitions in ALTER TABLE ADD 
PARTITION
..


Patch Set 8:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4144/8/fe/src/main/java/com/cloudera/impala/service/CatalogOpExecutor.java
File fe/src/main/java/com/cloudera/impala/service/CatalogOpExecutor.java:

PS8, Line 1766: for (Partition partition: hmsPartitions) {
  : // Create and add the HdfsPartition. Return the table 
object with an updated
  : // catalog version.
  : addHdfsPartition(tbl, partition);
  :   }
> Correct, this is exactly what happens. I've added an E2E test (test_add_ove
No, I don't think is the correct behavior. Actually I filed a JIRA 
(IMPALA-4141) about this case because it wasn't even working correctly for the 
single partition case. The right thing to do is to also add the partition in 
the catalog cache. If the user tries to add a partition he should get one of 
the two possible outcomes. Either an error is thrown and the partition is not 
added or no error and the partition is added.  No error and no partition added 
is weird from a usability point of view.


-- 
To view, visit http://gerrit.cloudera.org:8080/4144
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iddbc951f2931f488f7048c9780260f6b49100750
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-HasComments: Yes