[GitHub] [beam] piotr-szuberski commented on a change in pull request #11661: [BEAM-7774] Remove perfkit benchmarking tool from python performance …
piotr-szuberski commented on a change in pull request #11661:
URL: https://github.com/apache/beam/pull/11661#discussion_r432409480

## File path: .test-infra/metrics/grafana/dashboards/perftests_metrics/Python_Performance_Tests.json

## @@ -77,7 +77,7 @@
       ],
       "orderByTime": "ASC",
       "policy": "default",
-      "query": "SELECT mean(\"value\") FROM \"wordcount_py27_results\" WHERE metric = 'Python performance test' AND $timeFilter GROUP BY time($__interval), \"metric\"",
+      "query": "SELECT mean(\"value\") FROM \"wordcount_py27_results\" WHERE metric = 'wordcount_it_runtime' AND $timeFilter GROUP BY time($__interval), \"metric\"",

Review comment: I changed it to 'runtime'.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
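A small sketch of the constraint this change enforces: the metric name published by the test must match the name filtered on in the dashboard's `WHERE metric = ...` clause, or the panel shows no data. `build_query` below simply mirrors the corrected InfluxDB query from the diff; the function itself is illustrative, not part of the dashboard code.

```python
# Illustrative only: reproduces the corrected dashboard query so the metric
# name published by the test can be checked against it.
METRIC_NAME = 'wordcount_it_runtime'

def build_query(measurement, metric_name):
    # Mirrors the "+" line of the diff above.
    return ('SELECT mean("value") FROM "{}" '
            "WHERE metric = '{}' AND $timeFilter "
            'GROUP BY time($__interval), "metric"').format(
                measurement, metric_name)

query = build_query('wordcount_py27_results', METRIC_NAME)
print(query)
```

If the test publishes under any other label (such as the old 'Python performance test'), the `metric = 'wordcount_it_runtime'` filter silently drops every point.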
[GitHub] [beam] piotr-szuberski commented on a change in pull request #11661: [BEAM-7774] Remove perfkit benchmarking tool from python performance …
piotr-szuberski commented on a change in pull request #11661:
URL: https://github.com/apache/beam/pull/11661#discussion_r432061809

## File path: sdks/python/apache_beam/examples/wordcount_it_test.py

## @@ -84,11 +87,45 @@ def _run_wordcount_it(self, run_wordcount, **opts):
     # Register clean up before pipeline execution
     self.addCleanup(delete_files, [test_output + '*'])

+    publish_to_bq = bool(
+        test_pipeline.get_option('publish_to_big_query') or False)
+
+    # Start measure time for performance test
+    start_time = time.time()
+
     # Get pipeline options from command argument: --test-pipeline-options,
     # and start pipeline job by calling pipeline main function.
     run_wordcount(
         test_pipeline.get_full_options_as_args(**extra_opts),
-        save_main_session=False)
+        save_main_session=False,
+    )
+
+    end_time = time.time()
+    run_time = end_time - start_time
+
+    if publish_to_bq:
+      self._publish_metrics(test_pipeline, run_time)
+
+  def _publish_metrics(self, pipeline, metric_value):
+    influx_options = InfluxDBMetricsPublisherOptions(
+        pipeline.get_option('influx_measurement'),
+        pipeline.get_option('influx_db_name'),
+        pipeline.get_option('influx_hostname'),
+        os.getenv('INFLUXDB_USER'),
+        os.getenv('INFLUXDB_USER_PASSWORD'),
+    )
+    metric_reader = MetricsReader(
+        project_name=pipeline.get_option('project'),
+        bq_table=pipeline.get_option('metrics_table'),
+        bq_dataset=pipeline.get_option('metrics_dataset'),
+        publish_to_bq=True,
+        influxdb_options=influx_options,
+    )
+
+    metric_reader.publish_values((
+        metric_value,

Review comment: Good point. I changed it to wordcount_it_runtime and swapped the order of key and value.
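Untangled from the diff, the flow being added is: time the pipeline run, then hand the wall-clock runtime to a publisher under the agreed metric name. In the real test the publisher is Beam's MetricsReader from the load-test utilities; the console publisher below is a stand-in so this sketch stays self-contained, and every name except `wordcount_it_runtime` is illustrative.

```python
import time

class ConsolePublisher:
    # Stand-in for Beam's MetricsReader (illustrative, not the real API).
    def publish_values(self, labeled_values):
        for name, value in labeled_values:
            print('{}: {:.3f}s'.format(name, value))

def run_and_time(pipeline_fn, publish, publisher):
    start_time = time.time()
    pipeline_fn()                        # run the pipeline under test
    run_time = time.time() - start_time  # wall-clock runtime of the job
    if publish:
        # Note the (name, value) order, as corrected in this review.
        publisher.publish_values([('wordcount_it_runtime', run_time)])
    return run_time
```

Passing `(name, value)` rather than `(value, name)` is exactly the ordering fix the comment refers to.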
[GitHub] [beam] piotr-szuberski commented on a change in pull request #11661: [BEAM-7774] Remove perfkit benchmarking tool from python performance …
piotr-szuberski commented on a change in pull request #11661:
URL: https://github.com/apache/beam/pull/11661#discussion_r431143884

## File path: .test-infra/metrics/grafana/dashboards/perftests_metrics/Python_Performance_Tests.json

## @@ -0,0 +1,297 @@
+{

Review comment: "Python WordCount IT Benchmarks" definitely sounds better. BTW, there was a typo: WorldCount instead of WordCount :D
[GitHub] [beam] piotr-szuberski commented on a change in pull request #11661: [BEAM-7774] Remove perfkit benchmarking tool from python performance …
piotr-szuberski commented on a change in pull request #11661:
URL: https://github.com/apache/beam/pull/11661#discussion_r431142517

## File path: sdks/python/apache_beam/examples/wordcount_it_test.py

## @@ -104,18 +107,33 @@ def _run_wordcount_it(self, run_wordcount, **opts):
     run_time = end_time - start_time

     if publish_to_bq:
-      bq_publisher = BigQueryMetricsPublisher(
-          project_name=test_pipeline.get_option('project'),
-          table=test_pipeline.get_option('metrics_table'),
-          dataset=test_pipeline.get_option('metrics_dataset'),
-      )
-      result = Metric(
-          submit_timestamp=time.time(),
-          metric_id=uuid.uuid4().hex,
-          value=run_time,
-          label='Python performance test',
-      )
-      bq_publisher.publish([result.as_dict()])
+      self._publish_metrics(test_pipeline, run_time)
+
+  def _publish_metrics(self, pipeline, metric_value):

Review comment: I think adding a method to the MetricsReader is a very good idea.
[GitHub] [beam] piotr-szuberski commented on a change in pull request #11661: [BEAM-7774] Remove perfkit benchmarking tool from python performance …
piotr-szuberski commented on a change in pull request #11661:
URL: https://github.com/apache/beam/pull/11661#discussion_r430935474

## File path: sdks/python/apache_beam/examples/wordcount_it_test.py

## @@ -104,18 +107,33 @@ def _run_wordcount_it(self, run_wordcount, **opts):
     run_time = end_time - start_time

     if publish_to_bq:
-      bq_publisher = BigQueryMetricsPublisher(
-          project_name=test_pipeline.get_option('project'),
-          table=test_pipeline.get_option('metrics_table'),
-          dataset=test_pipeline.get_option('metrics_dataset'),
-      )
-      result = Metric(
-          submit_timestamp=time.time(),
-          metric_id=uuid.uuid4().hex,
-          value=run_time,
-          label='Python performance test',
-      )
-      bq_publisher.publish([result.as_dict()])
+      self._publish_metrics(test_pipeline, run_time)
+
+  def _publish_metrics(self, pipeline, metric_value):

Review comment: I can add something like publish_single_value(bq/influx/console) for each publisher.
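A minimal sketch of what such a per-publisher convenience method could look like, assuming the record shape of the old `Metric(...)` code in the diff (submit_timestamp, metric_id, value, label). The class and method names here are hypothetical, taken from this thread's suggestion, not from the final Beam API; an in-memory publisher stands in for the BigQuery one so the sketch is runnable.

```python
import time
import uuid

def single_value_record(label, value):
    # The dict shape mirrors the Metric(...) fields in the removed code above.
    return {
        'submit_timestamp': time.time(),
        'metric_id': uuid.uuid4().hex,
        'value': value,
        'label': label,
    }

class InMemoryBqPublisher:
    # Stand-in for BigQueryMetricsPublisher; collects rows instead of writing.
    def __init__(self):
        self.rows = []

    def publish(self, rows):
        self.rows.extend(rows)

    def publish_single_value(self, label, value):
        # The convenience wrapper proposed in the comment above: one call
        # per scalar measurement, no manual record assembly at the call site.
        self.publish([single_value_record(label, value)])
```

Each publisher (bq/influx/console) would get the same one-liner entry point, so the test only ever deals with a (label, value) pair.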
[GitHub] [beam] piotr-szuberski commented on a change in pull request #11661: [BEAM-7774] Remove perfkit benchmarking tool from python performance …
piotr-szuberski commented on a change in pull request #11661:
URL: https://github.com/apache/beam/pull/11661#discussion_r429088765

## File path: .test-infra/jenkins/job_PerformanceTests_Python.groovy

## @@ -58,117 +26,59 @@ def dataflowPipelineArgs = [
   temp_location : 'gs://temp-storage-for-end-to-end-tests/temp-it',
 ]
-
-// Configurations of each Jenkins job.
-def testConfigurations = [
-    new PerformanceTestConfigurations(
-        jobName : 'beam_PerformanceTests_WordCountIT_Py27',
-        jobDescription: 'Python SDK Performance Test - Run WordCountIT in Py27 with 1Gb files',
-        jobTriggerPhrase : 'Run Python27 WordCountIT Performance Test',
-        resultTable : 'beam_performance.wordcount_py27_pkb_results',
-        test : 'apache_beam.examples.wordcount_it_test:WordCountIT.test_wordcount_it',
-        itModule : ':sdks:python:test-suites:dataflow:py2',
-        extraPipelineArgs : dataflowPipelineArgs + [
-            input: 'gs://apache-beam-samples/input_small_files/ascii_sort_1MB_input.*', // 1Gb
-            output: 'gs://temp-storage-for-end-to-end-tests/py-it-cloud/output',
-            expect_checksum: 'ea0ca2e5ee4ea5f218790f28d0b9fe7d09d8d710',
-            num_workers: '10',
-            autoscaling_algorithm: 'NONE', // Disable autoscale the worker pool.
-        ],
-    ),
-    new PerformanceTestConfigurations(
-        jobName : 'beam_PerformanceTests_WordCountIT_Py35',
-        jobDescription: 'Python SDK Performance Test - Run WordCountIT in Py35 with 1Gb files',
-        jobTriggerPhrase : 'Run Python35 WordCountIT Performance Test',
-        resultTable : 'beam_performance.wordcount_py35_pkb_results',
-        test : 'apache_beam.examples.wordcount_it_test:WordCountIT.test_wordcount_it',
-        itModule : ':sdks:python:test-suites:dataflow:py35',
-        extraPipelineArgs : dataflowPipelineArgs + [
-            input: 'gs://apache-beam-samples/input_small_files/ascii_sort_1MB_input.*', // 1Gb
-            output: 'gs://temp-storage-for-end-to-end-tests/py-it-cloud/output',
-            expect_checksum: 'ea0ca2e5ee4ea5f218790f28d0b9fe7d09d8d710',
-            num_workers: '10',
-            autoscaling_algorithm: 'NONE', // Disable autoscale the worker pool.
-        ],
-    ),
-    new PerformanceTestConfigurations(
-        jobName : 'beam_PerformanceTests_WordCountIT_Py36',
-        jobDescription: 'Python SDK Performance Test - Run WordCountIT in Py36 with 1Gb files',
-        jobTriggerPhrase : 'Run Python36 WordCountIT Performance Test',
-        resultTable : 'beam_performance.wordcount_py36_pkb_results',
-        test : 'apache_beam.examples.wordcount_it_test:WordCountIT.test_wordcount_it',
-        itModule : ':sdks:python:test-suites:dataflow:py36',
-        extraPipelineArgs : dataflowPipelineArgs + [
-            input: 'gs://apache-beam-samples/input_small_files/ascii_sort_1MB_input.*', // 1Gb
-            output: 'gs://temp-storage-for-end-to-end-tests/py-it-cloud/output',
-            expect_checksum: 'ea0ca2e5ee4ea5f218790f28d0b9fe7d09d8d710',
-            num_workers: '10',
-            autoscaling_algorithm: 'NONE', // Disable autoscale the worker pool.
-        ],
-    ),
-    new PerformanceTestConfigurations(
-        jobName : 'beam_PerformanceTests_WordCountIT_Py37',
-        jobDescription: 'Python SDK Performance Test - Run WordCountIT in Py37 with 1Gb files',
-        jobTriggerPhrase : 'Run Python37 WordCountIT Performance Test',
-        resultTable : 'beam_performance.wordcount_py37_pkb_results',
-        test : 'apache_beam.examples.wordcount_it_test:WordCountIT.test_wordcount_it',
-        itModule : ':sdks:python:test-suites:dataflow:py37',
-        extraPipelineArgs : dataflowPipelineArgs + [
-            input: 'gs://apache-beam-samples/input_small_files/ascii_sort_1MB_input.*', // 1Gb
-            output: 'gs://temp-storage-for-end-to-end-tests/py-it-cloud/output',
-            expect_checksum: 'ea0ca2e5ee4ea5f218790f28d0b9fe7d09d8d710',
-            num_workers: '10',
-            autoscaling_algorithm: 'NONE', // Disable autoscale the worker pool.
-        ],
-    ),
-]
-
+testConfigurations = []
+pythonVersions = ['27', '35', '36', '37']
+
+for (pythonVersion in pythonVersions) {

Review comment: I'm not sure I understand the meaning of "dashboards" correctly here, but we run the tasks via the proper Python modules, which already have the pythonVersion variable set up, so there is no need to set -PpythonVersion manually.

Edit: As far as I know there is no dashboard for those tests yet; we just publish the metric results to BQ. I could add reporting to InfluxDB and draw Grafana dashboards for this test, WDYT?
[GitHub] [beam] piotr-szuberski commented on a change in pull request #11661: [BEAM-7774] Remove perfkit benchmarking tool from python performance …
piotr-szuberski commented on a change in pull request #11661:
URL: https://github.com/apache/beam/pull/11661#discussion_r429095714

## File path: sdks/python/test-suites/dataflow/common.gradle

## @@ -109,4 +109,21 @@ task validatesRunnerStreamingTests {
       args '-c', ". ${envdir}/bin/activate && ${runScriptsDir}/run_integration_test.sh $cmdArgs"
     }
   }
-}
\ No newline at end of file
+}
+
+task runPerformanceTest {
+  dependsOn 'installGcpTest'
+  dependsOn ':sdks:python:sdist'
+
+  def test = project.findProperty('test')
+  def testOpts = project.findProperty('test-pipeline-options')
+  testOpts += " --sdk_location=${files(configurations.distTarBall.files).singleFile}"
+
+  doLast {
+    exec {
+      workingDir "${project.rootDir}/sdks/python"
+      executable 'sh'
+      args '-c', ". ${envdir}/bin/activate && ${envdir}/bin/python setup.py nosetests --tests=${test} --test-pipeline-options=\"${testOpts}\" --ignore-files \'.*py3\\d?\\.py\$\'"

Review comment: I wanted to be on the safe side because it is added in most places (not only in the bash scripts). But thinking about it, we don't need to ignore anything at all; we run just one test. I'll remove it.
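As a quick check of what that --ignore-files pattern actually skips, the regex from the Gradle task can be exercised directly. The file names below are made up for illustration; only the pattern itself comes from the diff.

```python
import re

# The pattern passed to nosetests via --ignore-files in the task above.
ignore = re.compile(r'.*py3\d?\.py$')

# Hypothetical file names, for illustration only.
assert ignore.match('wordcount_it_test_py3.py')    # Python 3 variant: skipped
assert ignore.match('some_module_py37.py')         # versioned variant: skipped
assert not ignore.match('wordcount_it_test.py')    # plain test file: kept
```

Since the performance job runs exactly one named test, the filter has nothing to exclude, which is why dropping it is safe.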
[GitHub] [beam] piotr-szuberski commented on a change in pull request #11661: [BEAM-7774] Remove perfkit benchmarking tool from python performance …
piotr-szuberski commented on a change in pull request #11661:
URL: https://github.com/apache/beam/pull/11661#discussion_r429094648

## File path: sdks/python/test-suites/dataflow/common.gradle

## @@ -109,4 +109,21 @@ task validatesRunnerStreamingTests {
      args '-c', ". ${envdir}/bin/activate && ${runScriptsDir}/run_integration_test.sh $cmdArgs"
    }
  }
-}
\ No newline at end of file
+}
+
+task runPerformanceTest {
+  dependsOn 'installGcpTest'
+  dependsOn ':sdks:python:sdist'
+
+  def test = project.findProperty('test')
+  def testOpts = project.findProperty('test-pipeline-options')
+  testOpts += " --sdk_location=${files(configurations.distTarBall.files).singleFile}"
+
+  doLast {
+    exec {
+      workingDir "${project.rootDir}/sdks/python"
+      executable 'sh'
+      args '-c', ". ${envdir}/bin/activate && ${envdir}/bin/python setup.py nosetests --tests=${test} --test-pipeline-options=\"${testOpts}\" --ignore-files \'.*py3\\d?\\.py\$\'"

Review comment: Done.
[GitHub] [beam] piotr-szuberski commented on a change in pull request #11661: [BEAM-7774] Remove perfkit benchmarking tool from python performance …
piotr-szuberski commented on a change in pull request #11661:
URL: https://github.com/apache/beam/pull/11661#discussion_r429088004

## File path: .test-infra/jenkins/job_PerformanceTests_Python.groovy

## @@ -58,117 +26,59 @@ def dataflowPipelineArgs = [
   temp_location : 'gs://temp-storage-for-end-to-end-tests/temp-it',
 ]
-
-// Configurations of each Jenkins job.
-def testConfigurations = [
-    new PerformanceTestConfigurations(
-        jobName : 'beam_PerformanceTests_WordCountIT_Py27',
-        jobDescription: 'Python SDK Performance Test - Run WordCountIT in Py27 with 1Gb files',
-        jobTriggerPhrase : 'Run Python27 WordCountIT Performance Test',
-        resultTable : 'beam_performance.wordcount_py27_pkb_results',
-        test : 'apache_beam.examples.wordcount_it_test:WordCountIT.test_wordcount_it',
-        itModule : ':sdks:python:test-suites:dataflow:py2',
-        extraPipelineArgs : dataflowPipelineArgs + [
-            input: 'gs://apache-beam-samples/input_small_files/ascii_sort_1MB_input.*', // 1Gb
-            output: 'gs://temp-storage-for-end-to-end-tests/py-it-cloud/output',
-            expect_checksum: 'ea0ca2e5ee4ea5f218790f28d0b9fe7d09d8d710',
-            num_workers: '10',
-            autoscaling_algorithm: 'NONE', // Disable autoscale the worker pool.
-        ],
-    ),
-    new PerformanceTestConfigurations(
-        jobName : 'beam_PerformanceTests_WordCountIT_Py35',
-        jobDescription: 'Python SDK Performance Test - Run WordCountIT in Py35 with 1Gb files',
-        jobTriggerPhrase : 'Run Python35 WordCountIT Performance Test',
-        resultTable : 'beam_performance.wordcount_py35_pkb_results',
-        test : 'apache_beam.examples.wordcount_it_test:WordCountIT.test_wordcount_it',
-        itModule : ':sdks:python:test-suites:dataflow:py35',
-        extraPipelineArgs : dataflowPipelineArgs + [
-            input: 'gs://apache-beam-samples/input_small_files/ascii_sort_1MB_input.*', // 1Gb
-            output: 'gs://temp-storage-for-end-to-end-tests/py-it-cloud/output',
-            expect_checksum: 'ea0ca2e5ee4ea5f218790f28d0b9fe7d09d8d710',
-            num_workers: '10',
-            autoscaling_algorithm: 'NONE', // Disable autoscale the worker pool.
-        ],
-    ),
-    new PerformanceTestConfigurations(
-        jobName : 'beam_PerformanceTests_WordCountIT_Py36',
-        jobDescription: 'Python SDK Performance Test - Run WordCountIT in Py36 with 1Gb files',
-        jobTriggerPhrase : 'Run Python36 WordCountIT Performance Test',
-        resultTable : 'beam_performance.wordcount_py36_pkb_results',
-        test : 'apache_beam.examples.wordcount_it_test:WordCountIT.test_wordcount_it',
-        itModule : ':sdks:python:test-suites:dataflow:py36',
-        extraPipelineArgs : dataflowPipelineArgs + [
-            input: 'gs://apache-beam-samples/input_small_files/ascii_sort_1MB_input.*', // 1Gb
-            output: 'gs://temp-storage-for-end-to-end-tests/py-it-cloud/output',
-            expect_checksum: 'ea0ca2e5ee4ea5f218790f28d0b9fe7d09d8d710',
-            num_workers: '10',
-            autoscaling_algorithm: 'NONE', // Disable autoscale the worker pool.
-        ],
-    ),
-    new PerformanceTestConfigurations(
-        jobName : 'beam_PerformanceTests_WordCountIT_Py37',
-        jobDescription: 'Python SDK Performance Test - Run WordCountIT in Py37 with 1Gb files',
-        jobTriggerPhrase : 'Run Python37 WordCountIT Performance Test',
-        resultTable : 'beam_performance.wordcount_py37_pkb_results',
-        test : 'apache_beam.examples.wordcount_it_test:WordCountIT.test_wordcount_it',
-        itModule : ':sdks:python:test-suites:dataflow:py37',
-        extraPipelineArgs : dataflowPipelineArgs + [
-            input: 'gs://apache-beam-samples/input_small_files/ascii_sort_1MB_input.*', // 1Gb
-            output: 'gs://temp-storage-for-end-to-end-tests/py-it-cloud/output',
-            expect_checksum: 'ea0ca2e5ee4ea5f218790f28d0b9fe7d09d8d710',
-            num_workers: '10',
-            autoscaling_algorithm: 'NONE', // Disable autoscale the worker pool.
-        ],
-    ),
-]
-
+testConfigurations = []
+pythonVersions = ['27', '35', '36', '37']

Review comment: I tried to keep the effect of the job as close to the original as possible. I agree that two versions of Python sound sufficient.
[GitHub] [beam] piotr-szuberski commented on a change in pull request #11661: [BEAM-7774] Remove perfkit benchmarking tool from python performance …
piotr-szuberski commented on a change in pull request #11661:
URL: https://github.com/apache/beam/pull/11661#discussion_r429087219

## File path: sdks/python/test-suites/dataflow/py2/build.gradle

## @@ -205,3 +205,20 @@ task chicagoTaxiExample {
     }
   }
 }
+
+task runPerformanceTest {

Review comment: Sure, even more code can be moved to common.gradle.