Zach Amsden has posted comments on this change. Change subject: IMPALA-4762: RECOVER PARTITIONS should batch partition updates ......................................................................
Patch Set 1: (6 comments) http://gerrit.cloudera.org:8080/#/c/6275/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: PS1, Line 2641: // ifNotExists and needResults are true. > nit: move comment above L2644 fixed PS1, Line 2646: } catch (AlreadyExistsException e) { : throw new InternalException( : "AlreadyExistsException thrown although ifNotExists given", e); : } > I don't think we need to handle this case separately. Let the outer catch h killed it http://gerrit.cloudera.org:8080/#/c/6275/1/tests/metadata/test_recover_partitions.py File tests/metadata/test_recover_partitions.py: PS1, Line 171: test_recover_many_partitions > This test may be too expensive. How much time does it take to run? See my c ~30 seconds or so.. PS1, Line 182: # Preload 10% of the partitions with partition(i) = values(i) : for i in xrange(1, 700, 10): : self.execute_query_expect_success(self.client, : "INSERT INTO TABLE %s PARTITION(s='part%d') VALUES(%d)" % (FQ_TBL_NAME, i, i)) > I think we can live without this part. agreed. PS1, Line 193: self.filesystem_client.create_file(TBL_LOCATION + PART_DIR + FILE_PATH, : INSERTED_VALUE) > It should be sufficient to just create the directory, no? if I don't care about the values at all, this should be faster PS1, Line 196: result = self.execute_query_expect_success(self.client, : "SHOW PARTITIONS %s" % FQ_TBL_NAME) : for i in xrange(1, 700): : PART_DIR = "part%d\t" % i : if (i % 10 == 1): : assert self.has_value(PART_DIR, result.data) : else: : assert not self.has_value(PART_DIR, result.data) : : self.execute_query_expect_success(self.client, : "ALTER TABLE %s RECOVER PARTITIONS" % FQ_TBL_NAME) : result = self.execute_query_expect_success(self.client, : "select c from %s" % FQ_TBL_NAME) : for i in xrange(1, 700): : if (i % 10 == 1): : assert self.has_value(str(i), result.data) : else: : assert self.has_value(str(-i), result.data) > I would just do: a) show partitions and verify 0, b) recover partitions, c) Done -- To view, visit http://gerrit.cloudera.org:8080/6275 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7f9334051b11ba8fa16159b7ca67ddc7f2392733 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach Amsden <zams...@cloudera.com> Gerrit-Reviewer: Dan Hecht <dhe...@cloudera.com> Gerrit-Reviewer: Dimitris Tsirogiannis <dtsirogian...@cloudera.com> Gerrit-Reviewer: Zach Amsden <zams...@cloudera.com> Gerrit-HasComments: Yes