This is an automated email from the ASF dual-hosted git repository.

yhu pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git
The following commit(s) were added to refs/heads/master by this push:
     new b855ec58df5 Replace ip (104.154.241.245, 35.193.202.176) with metrics.beam.apache.org (#27945)
b855ec58df5 is described below

commit b855ec58df5fd8257713597d71022c892856ca56
Author: liferoad <huxiangq...@gmail.com>
AuthorDate: Mon Aug 14 09:46:43 2023 -0400

    Replace ip (104.154.241.245, 35.193.202.176) with metrics.beam.apache.org (#27945)

    * replace 104.154.241.245 with 35.193.202.176

    * more changes

    * use metrics.beam.apache.org

    ---------

    Co-authored-by: xqhu <x...@google.com>
---
 .test-infra/metrics/src/test/groovy/ProberTests.groovy |  2 +-
 sdks/python/apache_beam/testing/analyzers/README.md    | 17 ++++++++++-------
 .../apache_beam/testing/analyzers/tests_config.yaml    | 12 ++++++------
 3 files changed, 17 insertions(+), 14 deletions(-)

diff --git a/.test-infra/metrics/src/test/groovy/ProberTests.groovy b/.test-infra/metrics/src/test/groovy/ProberTests.groovy
index 5a44d4410a9..c5de9ca64c8 100644
--- a/.test-infra/metrics/src/test/groovy/ProberTests.groovy
+++ b/.test-infra/metrics/src/test/groovy/ProberTests.groovy
@@ -27,7 +27,7 @@ import static groovy.test.GroovyAssert.shouldFail
  */
 class ProberTests {
   // TODO: Make this configurable
-  def grafanaEndpoint = 'http://35.193.202.176'
+  def grafanaEndpoint = 'http://metrics.beam.apache.org'

   @Test
   void PingGrafanaHttpApi() {
diff --git a/sdks/python/apache_beam/testing/analyzers/README.md b/sdks/python/apache_beam/testing/analyzers/README.md
index 71351fe3e57..6098c82fd54 100644
--- a/sdks/python/apache_beam/testing/analyzers/README.md
+++ b/sdks/python/apache_beam/testing/analyzers/README.md
@@ -19,7 +19,8 @@

 # Performance alerts for Beam Python performance and load tests

-## Alerts
+## Alerts
+
 Performance regressions or improvements detected with the
 [Change Point Analysis](https://en.wikipedia.org/wiki/Change_detection) using
 [edivisive](https://github.com/apache/beam/blob/0a91d139dea4276dc46176c4cdcdfce210fc50c4/.test-infra/jenkins/job_InferenceBenchmarkTests_Python.groovy#L30)
 analyzer are automatically filed as Beam GitHub issues with a label `perf-alert`.
@@ -32,7 +33,8 @@ If a performance alert is created on a test, a GitHub issue will be created and
 URL, issue number along with the change point value and timestamp
 are exported to BigQuery. This data will be used to analyze the next change point observed on the
 same test to update already created GitHub issue or ignore performance alert by
 not creating GitHub issue to avoid duplicate issue creation.

-## Config file structure
+## Config file structure
+
 The config file defines the structure to run change point analysis on a given test. To add a test to the config file,
 please follow the below structure.
@@ -73,21 +75,22 @@ Sometimes, the change point found might be way back in time and could be irrelev
 reported only when it was observed in the last 7 runs from the current run,
 setting `num_runs_in_change_point_window=7` will achieve it.

-## Register a test for performance alerts
+## Register a test for performance alerts

 If a new test needs to be registered for the performance alerting tool, please add
 the required test parameters to the config file.

 ## Triage performance alert issues

-All the performance/load tests metrics defined at [beam/.test-infra/jenkins](https://github.com/apache/beam/tree/master/.test-infra/jenkins) are imported to [Grafana dashboards](http://104.154.241.245/d/1/getting-started?orgId=1) for visualization. Please
+All the performance/load tests metrics defined at [beam/.test-infra/jenkins](https://github.com/apache/beam/tree/master/.test-infra/jenkins) are imported to [Grafana dashboards](http://metrics.beam.apache.org/d/1/getting-started?orgId=1) for visualization. Please
 find the alerted test dashboard to find a spike in the metric values.
 For example, for the below configuration,
-* test_target: `apache_beam.testing.benchmarks.inference.pytorch_image_classification_benchmarks`
-* metric_name: `mean_load_model_latency_milli_secs`
-Grafana dashboard can be found at http://104.154.241.245/d/ZpS8Uf44z/python-ml-runinference-benchmarks?orgId=1&viewPanel=7
+- test_target: `apache_beam.testing.benchmarks.inference.pytorch_image_classification_benchmarks`
+- metric_name: `mean_load_model_latency_milli_secs`
+
+Grafana dashboard can be found at http://metrics.beam.apache.org/d/ZpS8Uf44z/python-ml-runinference-benchmarks?orgId=1&viewPanel=7

 If the dashboard for a test is not found, you can use the notebook
 `analyze_metric_data.ipynb` to generate a plot for the given test, metric_name.
diff --git a/sdks/python/apache_beam/testing/analyzers/tests_config.yaml b/sdks/python/apache_beam/testing/analyzers/tests_config.yaml
index e7741db93b0..bc74f292c48 100644
--- a/sdks/python/apache_beam/testing/analyzers/tests_config.yaml
+++ b/sdks/python/apache_beam/testing/analyzers/tests_config.yaml
@@ -22,7 +22,7 @@ pytorch_image_classification_benchmarks-resnet152-mean_inference_batch_latency_m
   test_description:
     Pytorch image classification on 50k images of size 224 x 224 with resnet 152.
     Test link - https://github.com/apache/beam/blob/42d0a6e3564d8b9c5d912428a6de18fb22a13ac1/.test-infra/jenkins/job_InferenceBenchmarkTests_Python.groovy#L63
-    Test dashboard - http://104.154.241.245/d/ZpS8Uf44z/python-ml-runinference-benchmarks?orgId=1&viewPanel=2
+    Test dashboard - http://metrics.beam.apache.org/d/ZpS8Uf44z/python-ml-runinference-benchmarks?orgId=1&viewPanel=2
   test_target:
   metrics_dataset: beam_run_inference
   metrics_table: torch_inference_imagenet_results_resnet152
@@ -33,7 +33,7 @@ pytorch_image_classification_benchmarks-resnet101-mean_load_model_latency_milli_
   test_description:
     Pytorch image classification on 50k images of size 224 x 224 with resnet 101.
     Test link - https://github.com/apache/beam/blob/42d0a6e3564d8b9c5d912428a6de18fb22a13ac1/.test-infra/jenkins/job_InferenceBenchmarkTests_Python.groovy#L34
-    Test dashboard - http://104.154.241.245/d/ZpS8Uf44z/python-ml-runinference-benchmarks?orgId=1&viewPanel=7
+    Test dashboard - http://metrics.beam.apache.org/d/ZpS8Uf44z/python-ml-runinference-benchmarks?orgId=1&viewPanel=7
   test_target: apache_beam.testing.benchmarks.inference.pytorch_image_classification_benchmarks
   metrics_dataset: beam_run_inference
   metrics_table: torch_inference_imagenet_results_resnet101
@@ -44,7 +44,7 @@ pytorch_image_classification_benchmarks-resnet101-mean_inference_batch_latency_m
   test_description:
     Pytorch image classification on 50k images of size 224 x 224 with resnet 101.
     Test link - https://github.com/apache/beam/blob/42d0a6e3564d8b9c5d912428a6de18fb22a13ac1/.test-infra/jenkins/job_InferenceBenchmarkTests_Python.groovy#L34
-    Test dashboard - http://104.154.241.245/d/ZpS8Uf44z/python-ml-runinference-benchmarks?orgId=1&viewPanel=2
+    Test dashboard - http://metrics.beam.apache.org/d/ZpS8Uf44z/python-ml-runinference-benchmarks?orgId=1&viewPanel=2
   test_target: apache_beam.testing.benchmarks.inference.pytorch_image_classification_benchmarks
   metrics_dataset: beam_run_inference
   metrics_table: torch_inference_imagenet_results_resnet101
@@ -55,7 +55,7 @@ pytorch_image_classification_benchmarks-resnet152-GPU-mean_inference_batch_laten
   test_description:
     Pytorch image classification on 50k images of size 224 x 224 with resnet 152 with Tesla T4 GPU.
     Test link - https://github.com/apache/beam/blob/42d0a6e3564d8b9c5d912428a6de18fb22a13ac1/.test-infra/jenkins/job_InferenceBenchmarkTests_Python.groovy#L151
-    Test dashboard - http://104.154.241.245/d/ZpS8Uf44z/python-ml-runinference-benchmarks?orgId=1&viewPanel=7
+    Test dashboard - http://metrics.beam.apache.org/d/ZpS8Uf44z/python-ml-runinference-benchmarks?orgId=1&viewPanel=7
   test_target: apache_beam.testing.benchmarks.inference.pytorch_image_classification_benchmarks
   metrics_dataset: beam_run_inference
   metrics_table: torch_inference_imagenet_results_resnet101
@@ -66,7 +66,7 @@ pytorch_image_classification_benchmarks-resnet152-GPU-mean_load_model_latency_mi
   test_description:
     Pytorch image classification on 50k images of size 224 x 224 with resnet 152 with Tesla T4 GPU.
     Test link - https://github.com/apache/beam/blob/42d0a6e3564d8b9c5d912428a6de18fb22a13ac1/.test-infra/jenkins/job_InferenceBenchmarkTests_Python.groovy#L151
-    Test dashboard - http://104.154.241.245/d/ZpS8Uf44z/python-ml-runinference-benchmarks?orgId=1&viewPanel=7
+    Test dashboard - http://metrics.beam.apache.org/d/ZpS8Uf44z/python-ml-runinference-benchmarks?orgId=1&viewPanel=7
   test_target: apache_beam.testing.benchmarks.inference.pytorch_image_classification_benchmarks
   metrics_dataset: beam_run_inference
   metrics_table: torch_inference_imagenet_results_resnet152_tesla_t4
@@ -77,7 +77,7 @@ pytorch_image_classification_benchmarks-resnet152-GPU-mean_inference_batch_laten
   test_description:
     Pytorch image classification on 50k images of size 224 x 224 with resnet 152 with Tesla T4 GPU.
     Test link - https://github.com/apache/beam/blob/42d0a6e3564d8b9c5d912428a6de18fb22a13ac1/.test-infra/jenkins/job_InferenceBenchmarkTests_Python.groovy#L151).
-    Test dashboard - http://104.154.241.245/d/ZpS8Uf44z/python-ml-runinference-benchmarks?from=now-90d&to=now&viewPanel=2
+    Test dashboard - http://metrics.beam.apache.org/d/ZpS8Uf44z/python-ml-runinference-benchmarks?from=now-90d&to=now&viewPanel=2
   test_target: apache_beam.testing.benchmarks.inference.pytorch_image_classification_benchmarks
   metrics_dataset: beam_run_inference
   metrics_table: torch_inference_imagenet_results_resnet152_tesla_t4
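The README touched by this commit states that a change point is reported only when it was observed within the last `num_runs_in_change_point_window` runs (e.g. 7) of the current run. A minimal sketch of that windowing rule, assuming a hypothetical helper name and signature (this is not Beam's actual analyzer code):

```python
def is_change_point_reportable(change_point_index: int,
                               num_runs: int,
                               window: int = 7) -> bool:
    """Return True when the change point falls within the last `window` runs.

    `change_point_index` is the 0-based index of the run where the change
    was detected; `num_runs` is the total number of runs observed so far.
    """
    return num_runs - change_point_index <= window


# A change point at run 95 of 100 lies within a 7-run window, so it would
# be reported; one at run 50 of 100 is stale and would be ignored.
print(is_change_point_reportable(95, 100))  # True
print(is_change_point_reportable(50, 100))  # False
```

Tuning `window` trades alert freshness against the risk of missing a regression that was first detected just outside the window.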