Dimitris Tsirogiannis has posted comments on this change. ( http://gerrit.cloudera.org:8080/8202 )
Change subject: IMPALA-4704: Turns on client connections when local catalog initialized. ...................................................................... Patch Set 9: (6 comments) http://gerrit.cloudera.org:8080/#/c/8202/9/be/src/testutil/in-process-servers.cc File be/src/testutil/in-process-servers.cc: http://gerrit.cloudera.org:8080/#/c/8202/9/be/src/testutil/in-process-servers.cc@75 PS9, Line 75: if (started.ok()) return impala; : LOG(WARNING) << started.GetDetail(); : : delete impala; The changes in this file seem to alter the behavior a bit. In particular, previously even if SetCatalogInitialized would return an error, impala_server_ would still be initialized (see L107-108). With your change, if SetCatalogInitialized() throws an error, StartWithClientServers () will return with an error causing impala to be deleted. I don't recall how we use this in-process-server thing and your changes are most likely doing the right thing, but just wanted to point out the change in behavior. http://gerrit.cloudera.org:8080/#/c/8202/9/bin/start-impala-cluster.py File bin/start-impala-cluster.py: http://gerrit.cloudera.org:8080/#/c/8202/9/bin/start-impala-cluster.py@80 PS9, Line 80: parser.add_option("--disable_catalog", dest="disable_catalog", default=False, : action="store_true", : help="Starts all processes except catalogd.") This is used only for testing and using this doesn't result in a valid cluster configuration. So, I think it's best if we hide/remove this option. One option is to hide it using something like SUPPRESS_HELP/USAGE. Other option is to control this behavior using an env variable and not a startup option. Maybe the first one is not too bad. Thoughts? http://gerrit.cloudera.org:8080/#/c/8202/9/bin/start-impala-cluster.py@289 PS9, Line 289: impalad's catalog "catalog cache" http://gerrit.cloudera.org:8080/#/c/8202/9/bin/start-impala-cluster.py@304 PS9, Line 304: def wait_for_catalog(impalad, timeout_in_seconds): : """Waits for the impalad catalog to become ready""" : start_time = time() : catalog_ready = False : attempt = 0 : while (time() - start_time < timeout_in_seconds and not catalog_ready): : try: : num_dbs = impalad.service.get_metric_value('catalog.num-databases') : num_tbls = impalad.service.get_metric_value('catalog.num-tables') : catalog_ready = impalad.service.get_metric_value('catalog.ready') : if catalog_ready or attempt % 4 == 0: : print 'Waiting for Catalog... Status: %s DBs / %s tables (ready=%s)' %\ : (num_dbs, num_tbls, catalog_ready) : attempt += 1 : except Exception, e: : print e : sleep(0.5) : if not catalog_ready: : raise RuntimeError('Catalog was not initialized in expected time period.') : : def wait_for_client(impalad, timeout_in_seconds=CLUSTER_WAIT_TIMEOUT_IN_SECONDS): : """Waits for the client ports to become ready.""" : start_time = time() : client_beeswax = None : client_hs2 = None : while (time() - start_time < timeout_in_seconds): : try: : client_beeswax = impalad.service.create_beeswax_client() : client_hs2 = impalad.service.create_hs2_client() : break; : except Exception as e: : print 'Client services not ready: %s. Trying again ...' : finally: : if client_beeswax is not None: client_beeswax.close() : sleep(0.5) : : if client_beeswax is None or client_hs2 is None: : raise RuntimeError('Client port not openned in expected time period.') Does it make sense to merge these two functions into a single wait_for_catalog function? wait_for_client() is only used for checking if the catalog is ready because we can't rely on the 'catalog.ready' metric, no? So, if this thing is not useful why not remove it and check for the client connections instead? And then we can retrieve the num_dbs and num_tbls from the metrics. http://gerrit.cloudera.org:8080/#/c/8202/9/fe/src/main/java/org/apache/impala/service/Frontend.java File fe/src/main/java/org/apache/impala/service/Frontend.java: http://gerrit.cloudera.org:8080/#/c/8202/9/fe/src/main/java/org/apache/impala/service/Frontend.java@860 PS9, Line 860: state update http://gerrit.cloudera.org:8080/#/c/8202/9/fe/src/main/java/org/apache/impala/service/Frontend.java@904 PS9, Line 904: impaladCatalog_.get() getCatalog() -- To view, visit http://gerrit.cloudera.org:8080/8202 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I52b881cba18a7e4533e21a78751c2e35c3d4c8a6 Gerrit-Change-Number: 8202 Gerrit-PatchSet: 9 Gerrit-Owner: Vuk Ercegovac <vercego...@cloudera.com> Gerrit-Reviewer: Balazs Jeszenszky <jes...@gmail.com> Gerrit-Reviewer: Dan Hecht <dhe...@cloudera.com> Gerrit-Reviewer: Dimitris Tsirogiannis <dtsirogian...@cloudera.com> Gerrit-Reviewer: Philip Zeyliger <phi...@cloudera.com> Gerrit-Reviewer: Vuk Ercegovac <vercego...@cloudera.com> Gerrit-Comment-Date: Fri, 20 Oct 2017 18:11:42 +0000 Gerrit-HasComments: Yes