damccorm commented on code in PR #30327:
URL: https://github.com/apache/beam/pull/30327#discussion_r1493021576


##########
.test-infra/metrics/sync/github/github_runs_prefetcher/code/config.yaml:
##########
@@ -0,0 +1,331 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+categories:
+  - name: core_java
+    groupThreshold: 0.7
+    tests:
+    - "PreCommit SQL Java17"
+    - "PreCommit SQL Java11"
+    - "LoadTests Java GBK Smoke"
+    - "PreCommit Java Amazon-Web-Services IO Direct"
+    - "PreCommit Java Amqp IO Direct"
+    - "PreCommit Java Amazon-Web-Services2 IO Direct"
+    - "PreCommit Java"
+    - "PreCommit Java Cassandra IO Direct"
+    - "PreCommit Java Azure IO Direct"
+    - "PreCommit Java Cdap IO Direct"
+    - "PreCommit Java Clickhouse IO Direct"
+    - "PreCommit Java Csv IO Direct"
+    - "Java Tests"
+    - "PostCommit Java Avro Versions"
+    - "PreCommit Java Debezium IO Direct"
+    - "PreCommit Java File-schema-transform IO Direct"
+    - "PostCommit Java"
+    - "PreCommit Java GCP IO Direct"
+    - "PostCommit Java BigQueryEarlyRollout"
+    - "PreCommit Java Google-ads IO Direct"
+    - "PreCommit Java HBase IO Direct"
+    - "PreCommit Java ElasticSearch IO Direct"
+    - "PreCommit Java HCatalog IO Direct"
+    - "PreCommit Java Hadoop IO Direct"
+    - "PreCommit Java IOs Direct"
+    - "PostCommit Java Hadoop Versions"
+    - "PreCommit Java Jms IO Direct"
+    - "PostCommit Java ValidatesRunner Direct JavaVersions"
+    - "PreCommit Java Kafka IO Direct"
+    - "PostCommit Java Examples Direct"
+    - "PreCommit Java JDBC IO Direct"
+    - "PostCommit Java ValidatesRunner Samza"
+    - "PreCommit Java Mqtt IO Direct"
+    - "PreCommit Java Kinesis IO Direct"
+    - "PreCommit Java MongoDb IO Direct"
+    - "PostCommit Java IO Performance Tests"
+    - "PreCommit Java Kudu IO Direct"
+    - "PostCommit Java InfluxDbIO Integration Test"
+    - "PostCommit Java Jpms Direct Java21"
+    - "PostCommit Java ValidatesRunner Twister2"
+    - "PreCommit Java Neo4j IO Direct"
+    - "PostCommit Java Jpms Direct Java11"
+    - "PostCommit Javadoc"
+    - "PostCommit Java Jpms Direct Java17"
+    - "PreCommit Java Pulsar IO Direct"
+    - "PostCommit Java ValidatesRunner ULR"
+    - "PreCommit Java Parquet IO Direct"
+    - "PreCommit Java Redis IO Direct"
+    - "Java JMH"
+    - "PreCommit Java RabbitMq IO Direct"
+    - "PreCommit Java RequestResponse IO Direct"
+    - "PostCommit Java Nexmark Direct"
+    - "PreCommit Java Splunk IO Direct"
+    - "PreCommit Java Thrift IO Direct"
+    - "PreCommit Java Snowflake IO Direct"
+    - "PreCommit Java Solr IO Direct"
+    - "PostCommit Java PVR Samza"
+    - "PreCommit Java Tika IO Direct"
+    - "PostCommit Java SingleStoreIO IT"
+    - "PostCommit Java Sickbay"
+    - "PostCommit Java ValidatesRunner Direct"
+    - "PreCommit Java SingleStore IO Direct"
+    - "PreCommit Java InfluxDb IO Direct"
+    - "PreCommit Spotless"
+    - "PreCommit Kotlin Examples"
+  - name: dataflow_java
+    tests:
+    - "PostCommit XVR GoUsingJava Dataflow"
+    - "PostCommit XVR PythonUsingJavaSQL Dataflow"
+    - "PostCommit XVR JavaUsingPython Dataflow"
+    - "PostCommit XVR PythonUsingJava Dataflow"
+    - "PreCommit Java Examples Dataflow Java11"
+    - "PreCommit Java Examples Dataflow Java17"
+    - "PreCommit Java Examples Dataflow Java21"
+    - "PreCommit Java Examples Dataflow"
+    - "PostCommit Java ValidatesRunner Dataflow"
+    - "PostCommit Java Dataflow V1"
+    - "PostCommit Java ValidatesRunner Dataflow Streaming"
+    - "PostCommit Java Dataflow V2"
+    - "PostCommit Java ValidatesRunner Dataflow V2"
+    - "PostCommit Java Examples Dataflow"
+    - "PostCommit Java Examples Dataflow ARM"
+    - "PostCommit Java ValidatesRunner Dataflow V2 Streaming"
+    - "PostCommit Java ValidatesRunner Dataflow JavaVersions"
+    - "PostCommit Java Examples Dataflow Java"
+    - "PostCommit Java Examples Dataflow V2 Java"
+    - "PostCommit Java Jpms Dataflow Java11"
+    - "PostCommit Java Jpms Dataflow Java17"
+    - "PostCommit Java Nexmark Dataflow"
+    - "PostCommit Java Nexmark Dataflow V2"
+    - "PostCommit Java Nexmark Dataflow V2 Java"
+    - "PostCommit Java Tpcds Dataflow"
+    - "PostCommit Java Examples Dataflow V2"
+  - name: runners_java
+    tests:
+    - "PostCommit Java PVR Spark3 Streaming"
+    - "PostCommit Java ValidatesRunner Spark"
+    - "PostCommit Java Examples Spark"
+    - "PostCommit Java ValidatesRunner SparkStructuredStreaming"
+    - "PostCommit Java ValidatesRunner Spark Java11"
+    - "PostCommit Java PVR Spark Batch"
+    - "PreCommit Java Spark3 Versions"
+    - "PostCommit Java Tpcds Spark"
+    - "PostCommit Java Jpms Spark Java11"
+    - "PostCommit Java Nexmark Spark"
+    - "PostCommit Java Examples Flink"
+    - "PostCommit Java Tpcds Flink"
+    - "PostCommit Java PVR Flink Streaming"
+    - "PostCommit Java Jpms Flink Java11"
+    - "PreCommit Java PVR Flink Batch"
+    - "PostCommit Java Nexmark Flink"
+    - "PreCommit Java PVR Flink Docker"
+    - "PreCommit Java Flink Versions"
+    - "PostCommit Java ValidatesRunner Flink Java11"
+    - "PostCommit Java ValidatesRunner Flink"
+  - name: load_perf_java
+    tests:
+    - "LoadTests Java CoGBK Dataflow Batch"
+    - "LoadTests Java CoGBK Dataflow V2 Streaming JavaVersions"
+    - "LoadTests Java CoGBK Dataflow Streaming"
+    - "LoadTests Java Combine Dataflow Batch"
+    - "LoadTests Java Combine Dataflow Streaming"
+    - "LoadTests Java CoGBK Dataflow V2 Batch JavaVersions"
+    - "LoadTests Java GBK Dataflow Batch"
+    - "LoadTests Java GBK Dataflow Streaming"
+    - "LoadTests Java GBK Dataflow V2 Batch Java11"
+    - "LoadTests Java GBK Dataflow V2 Streaming Java11"
+    - "LoadTests Java GBK Dataflow V2 Batch Java17"
+    - "LoadTests Java GBK Dataflow V2 Streaming Java17"
+    - "LoadTests Java ParDo Dataflow Streaming"
+    - "LoadTests Java ParDo Dataflow V2 Streaming JavaVersions"
+    - "LoadTests Java ParDo Dataflow V2 Batch JavaVersions"
+    - "LoadTests Java ParDo Dataflow Batch"
+    - "LoadTests Java ParDo SparkStructuredStreaming Batch"
+    - "LoadTests Java CoGBK SparkStructuredStreaming Batch"
+    - "LoadTests Java Combine SparkStructuredStreaming Batch"
+    - "LoadTests Java GBK SparkStructuredStreaming Batch"
+    - "PerformanceTests BigQueryIO Batch Java Avro"
+    - "PerformanceTests BigQueryIO Streaming Java"
+    - "PerformanceTests BigQueryIO Batch Java Json"
+    - "PerformanceTests SQLBigQueryIO Batch Java"
+    - "PerformanceTests XmlIOIT"
+    - "PostCommit XVR Samza"
+    - "PerformanceTests ManyFiles TextIOIT"
+    - "PerformanceTests XmlIOIT HDFS"
+    - "PerformanceTests ParquetIOIT"
+    - "PerformanceTests ParquetIOIT HDFS"
+    - "PerformanceTests AvroIOIT"
+    - "PerformanceTests ManyFiles TextIOIT HDFS"
+    - "PerformanceTests TFRecordIOIT"
+    - "PerformanceTests Cdap"
+    - "PerformanceTests TextIOIT"
+    - "PerformanceTests AvroIOIT HDFS"
+    - "PerformanceTests SingleStoreIO"
+    - "PerformanceTests SparkReceiver IO"
+    - "PerformanceTests Compressed TextIOIT"
+    - "PerformanceTests TextIOIT HDFS"
+    - "PerformanceTests Compressed TextIOIT HDFS"
+    - "PerformanceTests HadoopFormat"
+    - "PerformanceTests JDBC"
+    - "PerformanceTests Kafka IO"
+  - name: core_python
+    tests:
+    - "Python Dependency Tests" 
+    - "PreCommit Python Dataframes" 
+    - "PreCommit Python Examples" 
+    - "PreCommit Python Integration" 
+    - "PostCommit Python ValidatesRunner Samza" 
+    - "LoadTests Python Smoke" 
+    - "Update Python Depedencies" 
+    - "PreCommit Python Runners" 
+    - "PreCommit Python Transforms" 
+    - "PostCommit Python Xlang Gcp Direct" 
+    - "Build python source distribution and wheels" 
+    - "Python tests" 
+    - "PostCommit Sickbay Python" 
+    - "PostCommit Python" 
+    - "PostCommit Python Arm" 
+    - "PostCommit Python Examples Direct" 
+    - "PreCommit Portable Python" 
+    - "PreCommit Python Coverage" 
+    - "PreCommit Python Docker" 
+    - "PreCommit Python" 
+    - "PostCommit Python MongoDBIO IT" 
+    - "PreCommit Python Docs" 
+    - "PreCommit Python Formatter" 
+    - "PostCommit Python Nexmark Direct" 
+    - "PreCommit Python Lint" 
+  - name: runners_python
+    tests:
+    - "PostCommit Python ValidatesRunner Dataflow" 
+    - "Python ValidatesContainer Dataflow ARM" 
+    - "PostCommit Python Xlang Gcp Dataflow" 
+    - "PostCommit Python Xlang IO Dataflow" 
+    - "PostCommit Python Examples Dataflow" 
+    - "PostCommit Python ValidatesContainer Dataflow" 
+    - "PostCommit Python ValidatesContainer Dataflow With RC" 
+    - "PostCommit Python ValidatesRunner Spark" 
+    - "PostCommit Python Examples Spark" 
+    - "PostCommit Python ValidatesRunner Flink" 
+    - "PreCommit Python PVR Flink" 
+    - "PostCommit Python Examples Flink" 
+  - name: load_perf_python
+    tests:
+    - "PerformanceTests xlang KafkaIO Python" 
+    - "LoadTests Python FnApiRunner Microbenchmark" 
+    - "PerformanceTests SpannerIO Write 2GB Python Batch" 
+    - "PerformanceTests SpannerIO Read 2GB Python" 
+    - "PerformanceTests BiqQueryIO Read Python" 
+    - "PerformanceTests BiqQueryIO Write Python Batch" 
+    - "PerformanceTests TextIOIT Python" 
+    - "PerformanceTests WordCountIT PythonVersions" 
+    - "Performance alerting tool on Python load/performance/benchmark tests." 
+    - "LoadTests Python SideInput Dataflow Batch" 
+    - "LoadTests Python CoGBK Dataflow Batch" 
+    - "LoadTests Python CoGBK Dataflow Streaming" 
+    - "LoadTests Python Combine Dataflow Batch" 
+    - "Inference Python Benchmarks Dataflow" 
+    - "LoadTests Python Combine Dataflow Streaming" 
+    - "LoadTests Python GBK Dataflow Batch" 
+    - "LoadTests Python GBK Dataflow Streaming" 
+    - "LoadTests Python GBK reiterate Dataflow Batch" 
+    - "LoadTests Python GBK reiterate Dataflow Streaming" 
+    - "LoadTests Python ParDo Dataflow Streaming" 
+    - "CloudML Benchmarks Dataflow" 
+    - "LoadTests Python ParDo Dataflow Batch" 
+    - "LoadTests Python CoGBK Flink Batch" 
+    - "LoadTests Python Combine Flink Batch" 
+    - "LoadTests Python Combine Flink Streaming" 
+    - "PerformanceTests PubsubIOIT Python Streaming" 
+    - "LoadTests Python ParDo Flink Batch" 
+    - "LoadTests Python ParDo Flink Streaming" 
+  - name: go
+    tests:
+    - "PerformanceTests MongoDBIO IT" 
+    - "PreCommit Go" 
+    - "PreCommit GoPortable" 
+    - "PreCommit GoPrism" 
+    - "PostCommit Go VR Samza" 
+    - "Go tests" 
+    - "PostCommit Go" 
+    - "PostCommit Go Dataflow ARM" 
+    - "LoadTests Go CoGBK Dataflow Batch" 
+    - "LoadTests Go Combine Dataflow Batch" 
+    - "LoadTests Go GBK Dataflow Batch" 
+    - "LoadTests Go ParDo Dataflow Batch" 
+    - "LoadTests Go SideInput Dataflow Batch" 
+    - "PostCommit Go VR Spark" 
+    - "PostCommit Go VR Flink" 
+    - "LoadTests Go CoGBK Flink Batch" 
+    - "LoadTests Go Combine Flink Batch" 
+    - "LoadTests Go GBK Flink Batch" 
+    - "LoadTests Go ParDo Flink Batch" 
+    - "LoadTests Go SideInput Flink Batch" 
+  - name: core_infra

Review Comment:
   This was actually the grouping that I wanted to have the higher threshold 
(not core_java), could you swap those? I actually think something like 0.8 or 
0.9 is appropriate here



##########
.test-infra/metrics/sync/github/sync_workflows.py:
##########
@@ -600,52 +319,86 @@ def append_workflow_runs(workflow, runs):
       else:
         number_of_runs_to_add =\
           int(GH_NUMBER_OF_WORKFLOW_RUNS_TO_FETCH) - len(workflow.runs)
-        workflow.runs.extend([(0, 'None', 'None')] * number_of_runs_to_add)
+        workflow.runs.extend(
+          [WorkflowRun(0, "None", "None", workflow.id, "None")] * number_of_runs_to_add
+        )
       if len(workflow.runs) >= int(GH_NUMBER_OF_WORKFLOW_RUNS_TO_FETCH):
           workflow_ids_to_fetch_extra_runs.pop(workflow_id, None)
      print(f"Successfully fetched extra workflow runs for: {workflow.filename}")
     page += 1
   print("Successfully fetched workflow runs details")
 
   for workflow in list(workflows.values()):
-    runs = sorted(workflow.runs, key=lambda r: r[0], reverse=True)
+    runs = sorted(workflow.runs, key=lambda r: r.id, reverse=True)
     workflow.runs = runs[:int(GH_NUMBER_OF_WORKFLOW_RUNS_TO_FETCH)]
 
   return list(workflows.values())
 
-def database_operations(connection, workflows):
+
+def save_workflows(workflows):
+  connection = init_db_connection()
   # Create the table and update it with the latest workflow runs
   if not workflows:
     return
   cursor = connection.cursor()
   workflows_table_name = "github_workflows"
+  workflow_runs_table_name = "github_workflow_runs"
   cursor.execute(f"DROP TABLE IF EXISTS {workflows_table_name};")
-  create_table_query = f"""
-  CREATE TABLE IF NOT EXISTS {workflows_table_name} (
-    workflow_id integer NOT NULL PRIMARY KEY,
-    job_name text NOT NULL,
-    job_yml_filename text NOT NULL,
-    dashboard_category text NOT NULL"""
-  for i in range(int(GH_NUMBER_OF_WORKFLOW_RUNS_TO_FETCH)):
-    create_table_query += f""",
-    run{i+1} text,
-    run{i+1}Id text"""
-  create_table_query += ")\n"
-  cursor.execute(create_table_query)
-  insert_query = f"INSERT INTO {workflows_table_name} VALUES "
+  cursor.execute(f"DROP TABLE IF EXISTS {workflow_runs_table_name};")
+  create_workflows_table_query = f"""
+    CREATE TABLE IF NOT EXISTS {workflows_table_name} (
+      workflow_id integer NOT NULL PRIMARY KEY,
+      name text NOT NULL,
+      filename text NOT NULL,
+      url text NOT NULL,
+      dashboard_category text NOT NULL,
+      threshold real NOT NULL)\n"""
+  create_workflow_runs_table_query = f"""
+    CREATE TABLE IF NOT EXISTS {workflow_runs_table_name} (
+      run_id text NOT NULL PRIMARY KEY,
+      status text NOT NULL,
+      url text NOT NULL,
+      workflow_id integer NOT NULL REFERENCES {workflows_table_name} (workflow_id),
+      started_at timestamp with time zone NOT NULL)\n"""
+  cursor.execute(create_workflows_table_query)
+  cursor.execute(create_workflow_runs_table_query)
+  insert_workflows_query = f"""
+    INSERT INTO {workflows_table_name} (workflow_id, name, filename, url, dashboard_category, threshold)
+    VALUES %s"""
+  insert_workflow_runs_query = f"""
+    INSERT INTO {workflow_runs_table_name} (run_id, status, url, workflow_id, started_at)
+    VALUES %s"""
+  insert_workflows = []
+  insert_workflow_runs = []
   for workflow in workflows:
-    category = get_dashboard_category(workflow.name)
-    row_insert =\
-      f"(\'{workflow.id}\',\'{workflow.name}\',\'{workflow.filename}\',\'{category}\'"
-    for _, status, url in workflow.runs:
-      row_insert += f",\'{status}\',\'{url}\'"
-    insert_query += f"{row_insert}),"
-  insert_query = insert_query[:-1] + ";"
-  print(insert_query)
-  cursor.execute(insert_query)
+    insert_workflows.append(
+      (
+        workflow.id,
+        workflow.name,
+        workflow.filename,
+        workflow.url,
+        workflow.category,
+        workflow.threshold
+      )
+    )
+    for run in workflow.runs:
+      if run.id != 0:
+        started_at = run.started_at.replace("T", " ")
+        insert_workflow_runs.append(
+          (
+            run.id,
+            run.status,
+            run.url,
+            workflow.id,
+            started_at
+          )
+        )
+  psycopg2.extras.execute_values(cursor, insert_workflows_query, insert_workflows)
+  psycopg2.extras.execute_values(cursor, insert_workflow_runs_query, insert_workflow_runs)
   cursor.close()
   connection.commit()
   connection.close()

Review Comment:
   Can't remember if we discussed this, but do we still need this now that we 
have the separate fetcher? Can't we just populate everything at once?
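If the prefetcher really can supply everything up front, "populate everything at once" could look like the sketch below: flatten every workflow and its runs into row tuples first, then issue the two bulk inserts in a single transaction. The table and column names follow this diff; the `Workflow`/`WorkflowRun` shapes are assumptions inferred from how they are used here, not copied from the PR.

```python
# Hedged sketch of single-pass population. The row-building half is pure
# and testable; the database half is shown only as comments.
from collections import namedtuple

# Assumed shapes, inferred from this diff (not the PR's actual classes):
Workflow = namedtuple("Workflow", "id name filename url category threshold runs")
WorkflowRun = namedtuple("WorkflowRun", "id status url workflow_id started_at")

def build_rows(workflows):
    """Flatten workflows into (workflow_rows, run_rows) for execute_values.
    Placeholder runs (id == 0) are skipped, matching the diff above."""
    workflow_rows, run_rows = [], []
    for wf in workflows:
        workflow_rows.append(
            (wf.id, wf.name, wf.filename, wf.url, wf.category, wf.threshold))
        for run in wf.runs:
            if run.id != 0:
                # ISO "T" separator -> SQL timestamp separator, as in the diff
                run_rows.append((run.id, run.status, run.url, wf.id,
                                 run.started_at.replace("T", " ")))
    return workflow_rows, run_rows

# The DB side would then be one transaction, e.g.:
#   psycopg2.extras.execute_values(cursor, insert_workflows_query, workflow_rows)
#   psycopg2.extras.execute_values(cursor, insert_workflow_runs_query, run_rows)
#   connection.commit()
```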



##########
.test-infra/metrics/sync/github/github_runs_prefetcher/code/config.yaml:
##########
@@ -0,0 +1,331 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+categories:

Review Comment:
   Thanks for moving this to config, wanted to do this and didn't have the time 
to go back and do the cleanup!



##########
.test-infra/metrics/grafana/dashboards/GA-Post-Commits_status_dashboard.json:
##########
@@ -935,7 +935,7 @@
           "hide": false,
           "metricColumn": "none",
           "rawQuery": true,
-          "rawSql": "with workflows as (\n  select\n    *,\n    case when run1 like 'success' then 1 when run1 like 'in_progress' then 2 when run1 like 'queued' then 2 when run1 like 'waiting' then 2 when run1 like 'cancelled' then 3 when run1 like 'None' then 4 else 0 end as run_1,\n    case when run2 like 'success' then 1 when run2 like 'in_progress' then 2 when run2 like 'queued' then 2 when run2 like 'waiting' then 2 when run2 like 'cancelled' then 3 when run2 like 'None' then 4 else 0 end as run_2,\n    case when run3 like 'success' then 1 when run3 like 'in_progress' then 2 when run3 like 'queued' then 2 when run3 like 'waiting' then 2 when run3 like 'cancelled' then 3 when run3 like 'None' then 4 else 0 end as run_3,\n    case when run4 like 'success' then 1 when run4 like 'in_progress' then 2 when run4 like 'queued' then 2 when run4 like 'waiting' then 2 when run4 like 'cancelled' then 3 when run4 like 'None' then 4 else 0 end as run_4,\n    case when run5 like 'success' then 1 when run5 like 'in_progress' then 2 when run5 like 'queued' then 2 when run5 like 'waiting' then 2 when run5 like 'cancelled' then 3 when run5 like 'None' then 4 else 0 end as run_5,\n    case when run6 like 'success' then 1 when run6 like 'in_progress' then 2 when run6 like 'queued' then 2 when run6 like 'waiting' then 2 when run6 like 'cancelled' then 3 when run6 like 'None' then 4 else 0 end as run_6,\n    case when run7 like 'success' then 1 when run7 like 'in_progress' then 2 when run7 like 'queued' then 2 when run7 like 'waiting' then 2 when run7 like 'cancelled' then 3 when run7 like 'None' then 4 else 0 end as run_7,\n    case when run8 like 'success' then 1 when run8 like 'in_progress' then 2 when run8 like 'queued' then 2 when run8 like 'waiting' then 2 when run8 like 'cancelled' then 3 when run8 like 'None' then 4 else 0 end as run_8,\n    case when run9 like 'success' then 1 when run9 like 'in_progress' then 2 when run9 like 'queued' then 2 when run9 like 'waiting' then 2 when run9 like 'cancelled' then 3 when run9 like 'None' then 4 else 0 end as run_9,\n    case when run10 like 'success' then 1 when run10 like 'in_progress' then 2 when run10 like 'queued' then 2 when run10 like 'waiting' then 2 when run10 like 'cancelled' then 3 when run10 like 'None' then 4 else 0 end as run_10\n  from\n    github_workflows\n  where\n    dashboard_category = 'core_infra'\n)\nselect\n  job_name,\n  job_yml_filename,\n  run_1,\n  run1Id,\n  run_2,\n  run2Id,\n  run_3,\n  run3Id,\n  run_4,\n  run4Id,\n  run_5,\n  run5Id,\n  run_6,\n  run6Id,\n  run_7,\n  run7Id,\n  run_8,\n  run8Id,\n  run_9,\n  run9Id,\n  run_10,\n  run10Id\nfrom\n  workflows;",
+          "rawSql": "with workflows as (\n\twith temp as (SELECT t1.workflow_id, t1.job_name, t1.job_yml_filename, t1.dashboard_category, t1.run1, t2.run1id ,\nt1.run2, t2.run2id ,\nt1.run3, t2.run3id ,\nt1.run4, t2.run4id ,\nt1.run5, t2.run5id ,\nt1.run6, t2.run6id ,\nt1.run7, t2.run7id ,\nt1.run8, t2.run8id ,\nt1.run9, t2.run9id ,\nt1.run10, t2.run10id \nFROM (SELECT * FROM crosstab('SELECT github_workflows_test.workflow_id, name, filename, dashboard_category, run_number, status \n                              FROM github_workflow_runs_test\nINNER JOIN github_workflows_test ON github_workflow_runs_test.workflow_id = github_workflows_test.workflow_id\nORDER BY 1,5 DESC\n','SELECT m from generate_series(1,10) m')\nAS c1(workflow_id text, job_name text, job_yml_filename text, dashboard_category text, run1 text, run2 text,run3 text, \n         run4 text ,run5 text,run6 text, run7 text, \n         run8 text ,run9 text,run10 text) \n\t ) AS t1\n         JOIN (SELECT * FROM crosstab('SELECT name, status, github_workflow_runs_test.url AS run_url\nFROM github_workflow_runs_test\nINNER JOIN github_workflows_test ON github_workflow_runs_test.workflow_id = github_workflows_test.workflow_id\nORDER BY 1,3 DESC\n\n')\nAS c2(job_name text, run1id text, run2id text,run3id text, \n         run4id text ,run5id text,run6id text, run7id text, \n         run8id text ,run9id text,run10id text) \n\t\t\t  ) AS t2\n        ON t1.job_name = t2.job_name\n\t\t )\n  select\n    *,\n    case when run1 like 'success' then 1 when run1 like 'in_progress' then 2 when run1 like 'queued' then 2 when run1 like 'waiting' then 2 when run1 like 'cancelled' then 3 when run1 like 'failure' then 0 else 4 end as run_1,\n    case when run2 like 'success' then 1 when run2 like 'in_progress' then 2 when run2 like 'queued' then 2 when run2 like 'waiting' then 2 when run2 like 'cancelled' then 3 when run2 like 'failure' then 0 else 4 end as run_2,\n    case when run3 like 'success' then 1 when run3 like 'in_progress' then 2 when run3 like 'queued' then 2 when run3 like 'waiting' then 2 when run3 like 'cancelled' then 3 when run3 like 'failure' then 0 else 4 end as run_3,\n    case when run4 like 'success' then 1 when run4 like 'in_progress' then 2 when run4 like 'queued' then 2 when run4 like 'waiting' then 2 when run4 like 'cancelled' then 3 when run4 like 'failure' then 0 else 4 end as run_4,\n    case when run5 like 'success' then 1 when run5 like 'in_progress' then 2 when run5 like 'queued' then 2 when run5 like 'waiting' then 2 when run5 like 'cancelled' then 3 when run5 like 'failure' then 0 else 4 end as run_5,\n    case when run6 like 'success' then 1 when run6 like 'in_progress' then 2 when run6 like 'queued' then 2 when run6 like 'waiting' then 2 when run6 like 'cancelled' then 3 when run6 like 'failure' then 0 else 4 end as run_6,\n    case when run7 like 'success' then 1 when run7 like 'in_progress' then 2 when run7 like 'queued' then 2 when run7 like 'waiting' then 2 when run7 like 'cancelled' then 3 when run7 like 'failure' then 0 else 4 end as run_7,\n    case when run8 like 'success' then 1 when run8 like 'in_progress' then 2 when run8 like 'queued' then 2 when run8 like 'waiting' then 2 when run8 like 'cancelled' then 3 when run8 like 'failure' then 0 else 4 end as run_8,\n    case when run9 like 'success' then 1 when run9 like 'in_progress' then 2 when run9 like 'queued' then 2 when run9 like 'waiting' then 2 when run9 like 'cancelled' then 3 when run9 like 'failure' then 0 else 4 end as run_9,\n    case when run10 like 'success' then 1 when run10 like 'in_progress' then 2 when run10 like 'queued' then 2 when run10 like 'waiting' then 2 when run10 like 'cancelled' then 3 when run10 like 'failure' then 0 else 4 end as run_10\n  from\n    temp\n  where\n    dashboard_category = 'core_infra'\n)\nselect\n  job_name,\n  job_yml_filename,\n  run_1,\n  run1Id,\n  run_2,\n  run2Id,\n  run_3,\n  run3Id,\n  run_4,\n  run4Id,\n  run_5,\n  run5Id,\n  run_6,\n  run6Id,\n  run_7,\n  run7Id,\n  run_8,\n  run8Id,\n  run_9,\n  run9Id,\n  run_10,\n  run10Id\nfrom\n  workflows;",

Review Comment:
   A couple notes on the query:
   
   1) I think you left a `_test` suffix in by mistake (`github_workflows_test` instead of `github_workflows`, and likewise for `github_workflow_runs_test`)
   2) Why do we need the join in the first part of the query (`INNER JOIN 
github_workflows_test ON github_workflow_runs_test.workflow_id = 
github_workflows_test.workflow_id`) - isn't all of the data we need available 
in `github_workflow_runs`?
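On note 2: if the run rows carry `workflow_id`, `status`, and `started_at` themselves, the first `crosstab` can pivot `github_workflow_runs` alone and join the workflow metadata once afterwards. A hedged sketch of that simplification, held as Python strings so it can be sanity-checked; the `row_number()` pivot key is an assumption (the new table has no `run_number` column), and the query is untested against a live database.

```python
# Sketch of the simplified query: pivot runs without joining workflows inside
# the crosstab source. Table/column names follow the schema created in this PR.

runs_pivot = """
SELECT * FROM crosstab(
  $$SELECT workflow_id,
           row_number() OVER (PARTITION BY workflow_id
                              ORDER BY started_at DESC) AS run_number,
           status
    FROM github_workflow_runs
    ORDER BY 1, 2$$,
  $$SELECT m FROM generate_series(1, 10) m$$
) AS p(workflow_id integer, run1 text, run2 text, run3 text, run4 text,
       run5 text, run6 text, run7 text, run8 text, run9 text, run10 text)
"""

# Workflow metadata is joined exactly once, outside the pivot:
query = f"""
SELECT w.name, w.filename, p.*
FROM ({runs_pivot}) AS p
JOIN github_workflows AS w ON w.workflow_id = p.workflow_id;
"""

# Sanity checks: no join inside the pivot, and no leftover _test suffixes.
assert "INNER JOIN" not in runs_pivot
assert "github_workflows_test" not in query
```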



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
