List,

We are struggling to collect metrics from the tserver. We are attempting to pull the following metrics per tablet by querying the metrics JSON endpoint:

    allowed_metrics = [
        "rows_inserted",
        "rows_upserted",
        "rows_deleted",
        "scanner_rows_scanned",
        "upserts_as_updates",
        "rows_updated",
        "insertions_failed_dup_key",
    ]
We can see the metrics being sent, but they all appear to have the same value. I have the following questions:

1. Are metrics unique per `id` in the JSON payload?
2. How do others collect metrics for their clusters?

Any help would be appreciated. Thanks!

Our while loop looks like this:

    from time import time

    import requests

    while not self.shutdown_event.is_set():
        try:
            collection_time = time()
            http_response = requests.get(
                "%s://localhost:%s/metrics" % (self.protocol, self.port),
                verify=False)
            # The payload is a list of entities, each with its own type, id,
            # attributes and metrics.
            for metric_type in http_response.json():
                metric_prefix = metric_type['type']
                for metric in metric_type['metrics']:
                    if metric["name"] not in allowed_metrics:
                        continue
                    full_name = metric_prefix + "." + metric["name"]
                    for key, value in metric.items():
                        if key == "name":
                            continue
                        log.info("%s_%s -> %s", full_name, key, value)
                        try:
                            point = float(value)
                        except (TypeError, ValueError):
                            log.info("%s is not a number. Not sending", value)
                            continue
                        # Tag every point with the entity id so that each
                        # tablet is reported as its own series.
                        tags = metric_type['attributes'].copy()
                        tags['id'] = metric_type['id']
                        self.metrics_client.gauge(
                            "%s_%s" % (full_name, key), point,
                            timestamp=collection_time, tags=tags)
            self.metrics_client.flush(timestamp=collection_time)
        except Exception as ex:
            # The original call passed ex without a format placeholder.
            log.error("Failed to parse kudu metrics: %s", ex)
        log.info("Pausing for 10 seconds after processing metrics")
        self.shutdown_event.wait(10)
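
One way to check whether the values really are identical across tablets is to group a captured payload by entity `id` offline, before it reaches the metrics client. The following is only a minimal sketch. It assumes the payload shape the loop above already relies on (a list of entities, each with `type`, `id`, `attributes` and `metrics`). The sample payload, the helper names `flatten_metrics` and `values_by_id`, and the two-metric allow list are made up for illustration and are not real server output.

```python
def flatten_metrics(payload, allowed):
    """Yield (entity_type, entity_id, metric_name, value) for numeric metrics."""
    for entity in payload:
        for metric in entity.get("metrics", []):
            name = metric.get("name")
            if name not in allowed:
                continue
            value = metric.get("value")
            # Skip entries without a plain numeric "value" field
            # (bool is excluded because it is a subclass of int).
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                continue
            yield entity["type"], entity["id"], name, float(value)


def values_by_id(payload, metric_name, allowed):
    """Map entity id -> value for one metric, to compare tablets side by side."""
    result = {}
    for _type, entity_id, name, value in flatten_metrics(payload, allowed):
        if name == metric_name:
            result[entity_id] = value
    return result


# Hypothetical allow list (a subset of the one in the post) and sample payload.
ALLOWED = {"rows_inserted", "rows_updated"}
SAMPLE_PAYLOAD = [
    {
        "type": "tablet",
        "id": "tablet-a",
        "attributes": {"table_name": "t1"},
        "metrics": [
            {"name": "rows_inserted", "value": 10},
            {"name": "rows_updated", "value": 0},
        ],
    },
    {
        "type": "tablet",
        "id": "tablet-b",
        "attributes": {"table_name": "t1"},
        "metrics": [{"name": "rows_inserted", "value": 25}],
    },
    {
        "type": "server",
        "id": "server-1",
        "attributes": {},
        "metrics": [{"name": "not_in_allow_list", "value": 3}],
    },
]

if __name__ == "__main__":
    per_tablet = values_by_id(SAMPLE_PAYLOAD, "rows_inserted", ALLOWED)
    for entity_id, value in sorted(per_tablet.items()):
        print(entity_id, value)
```

If the real payload shows a different value for each `id` here but the dashboard shows a single value, the problem is more likely in how the points are named or tagged downstream than in the endpoint itself.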