Dimitris Tsirogiannis has posted comments on this change. ( http://gerrit.cloudera.org:8080/10291 )
Change subject: IMPALA-6948,IMPALA-6962: add end-to-end tests ...................................................................... Patch Set 2: (5 comments) http://gerrit.cloudera.org:8080/#/c/10291/2/tests/custom_cluster/test_metadata_replicas.py File tests/custom_cluster/test_metadata_replicas.py: http://gerrit.cloudera.org:8080/#/c/10291/2/tests/custom_cluster/test_metadata_replicas.py@39 PS2, Line 39: def test_load_all(self, cursor): : """ Loads metadata for all tables and validates that they're all the same.""" : # Use describe to issue table loads. : c_objects = self.cluster.catalogd.service.get_catalog_objects() : for obj in c_objects: : if obj[1] == "TABLE": cursor.execute("describe %s" % obj[0]) : # Use a sentinel ddl with sync_ddl to make sure that all impalads : # have receieved the effect of the previous ddl's. : cursor.execute("set sync_ddl=true") : cursor.execute("invalidate metadata functional.alltypes") : : self.__validate_metadata() This test is going to be very expensive and I am not sure how much extra coverage it adds compared to our existing metadata tests. Also, if we decide to keep it, you may want to consider using the load_catalog_in_background flag to trigger metadata load for all tables instead of running describe for each one of them. http://gerrit.cloudera.org:8080/#/c/10291/2/tests/custom_cluster/test_metadata_replicas.py@69 PS2, Line 69: cursor.execute("invalidate metadata") No need for the global one, you may do "invalidate metadata x". http://gerrit.cloudera.org:8080/#/c/10291/2/tests/custom_cluster/test_metadata_replicas.py@79 PS2, Line 79: wait_time_s = specific_build_type_timeout(10, slow_build_timeout=20) : sleep(wait_time_s) I agree with the TODO and I think we should fix it now. Essentially, in order to ensure that we are in steady state and that the coordinators have been through the cycle of detecting the catalog restart, asking the full update and receiving it, we need to compare the current value of the catalog version in the catalog and the last received catalog version in the coordinator (the latter is already a metric). http://gerrit.cloudera.org:8080/#/c/10291/2/tests/custom_cluster/test_metadata_replicas.py@81 PS2, Line 81: self.cluster.refresh() Hm, why is this needed? http://gerrit.cloudera.org:8080/#/c/10291/2/tests/custom_cluster/test_metadata_replicas.py@123 PS2, Line 123: def __dump_objects(self, catalog, impalads): : """ For debugging, prints all objects and their version.""" : print "CATALOG OBJECTS" : for k, v in catalog.items(): : print "%s, %s, %d" % (k, v[0], v[1]) : : for idx in xrange(0,len(impalads)): : print "IMPALAD %d OBJECTS" % idx : for k, v in impalads[idx].items(): : print "%s, %s, %d" % (k, v[0], v[1]) Is this used anywhere? -- To view, visit http://gerrit.cloudera.org:8080/10291 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic6c5b39e29b2885cd30fede18833cbf23fb755f5 Gerrit-Change-Number: 10291 Gerrit-PatchSet: 2 Gerrit-Owner: Vuk Ercegovac <vercego...@cloudera.com> Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com> Gerrit-Reviewer: Dimitris Tsirogiannis <dtsirogian...@cloudera.com> Gerrit-Reviewer: Vuk Ercegovac <vercego...@cloudera.com> Gerrit-Comment-Date: Mon, 07 May 2018 19:15:05 +0000 Gerrit-HasComments: Yes