tvalentyn commented on a change in pull request #12452:
URL: https://github.com/apache/beam/pull/12452#discussion_r468768265
##########
File path: CI.md
##########
@@ -75,8 +75,28 @@ run categories. Here is a summary of the run categories with regards of the jobs
 Those jobs often have matrix run strategy which runs several different variations of the jobs
 (with different platform type / Python version to run for example)
+### Google Cloud Platform Credentials
+
+Some of the jobs require variables stored as a [GitHub Secrets](https://docs.github.com/en/actions/configuring-and-managing-workflows/creating-and-storing-encrypted-secrets)
+to perform operations on Google Cloud Platform. Currently these jobs are limited to Apache repository only.
+These variables are:
+ * `GCP_PROJECT_ID` - ID of the Google Cloud project. Apache/Beam repository has it set to `apache-beam-testing`.

Review comment:
   For my education, which repository do we refer to here?
   Also, I would not hard-code existing values in documentation, since the source of truth may be updated and the documentation may not. You could say: `For example apache-beam-testing` instead.

##########
File path: CI.md
##########
@@ -75,8 +75,28 @@ run categories. Here is a summary of the run categories with regards of the jobs
 Those jobs often have matrix run strategy which runs several different variations of the jobs
 (with different platform type / Python version to run for example)
+### Google Cloud Platform Credentials
+
+Some of the jobs require variables stored as a [GitHub Secrets](https://docs.github.com/en/actions/configuring-and-managing-workflows/creating-and-storing-encrypted-secrets)
+to perform operations on Google Cloud Platform. Currently these jobs are limited to Apache repository only.
+These variables are:
+ * `GCP_PROJECT_ID` - ID of the Google Cloud project. Apache/Beam repository has it set to `apache-beam-testing`.
+ * `GCP_REGION` - Region of the bucket and dataflow jobs. Apache/Beam repository has it set to `us-central1`.
+ * `GCP_TESTING_BUCKET` - Name of the bucket where temporary files for Dataflow tests will be stored. Apache/Beam repository has it set to `beam-github-actions-tests`.
+ * `GCP_SA_EMAIL` - Service account email address. This is usually of the format `<name>@<project-id>.iam.gserviceaccount.com`.
+ * `GCP_SA_KEY` - Service account key. This key should be created and encoded as a Base64 string (eg. `cat my-key.json | base64` on macOS).
+
+Service Account shall have following permissions:
+ * Storage Admin (roles/storage.admin)
+ * Dataflow Admin (roles/dataflow.admin)
+
+### Workflows
+
+#### Build python source distribution and wheels - [build_wheels.yml](.github/workflows/build_wheels.yml)
+
 | Job | Description | Pull Request Run | Direct Push/Merge Run | Scheduled Run | Requires GCP Credentials |
 |-------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------|-----------------------|---------------|--------------------------|
+| Check are GCP variables set | Checks are GCP variables are set. Jobs which required them depends on the output of this job. | Yes | Yes | Yes | Yes/No |

Review comment:
   Suggestion: Consider using `Checks that GCP variables are set` or `Check GCP variables` throughout.
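As background for the gating described in the quoted table: conceptually, the check boils down to verifying that every documented GCP secret is present and non-empty, so that jobs requiring GCP credentials can be skipped cleanly instead of failing. Below is a minimal, hypothetical Python sketch of that idea. It is not the actual workflow step; the function and output names are made up for illustration, and only the variable names come from the secrets list quoted above.

```python
# Hypothetical sketch only -- not the actual "Check are GCP variables set" job.
# A downstream job would consume the printed output and skip GCP-dependent
# steps when any of the documented secrets is missing.
import os

REQUIRED_GCP_VARS = [
    "GCP_PROJECT_ID",
    "GCP_REGION",
    "GCP_TESTING_BUCKET",
    "GCP_SA_EMAIL",
    "GCP_SA_KEY",
]


def gcp_variables_set():
  """Returns True only if every required GCP variable has a non-empty value."""
  missing = [name for name in REQUIRED_GCP_VARS if not os.environ.get(name)]
  for name in missing:
    print("Missing GCP variable: %s" % name)
  return not missing


if __name__ == "__main__":
  # Emit a simple true/false flag that other jobs could depend on.
  print("gcp-variables-set=%s" % str(gcp_variables_set()).lower())
```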
##########
File path: CI.md
##########
@@ -85,16 +105,15 @@ Those jobs often have matrix run strategy which runs several different variation
 | List files on Google Cloud Storage Bucket | Lists files on GCS for verification purpose. | - | Yes | Yes | Yes |
 | Tag repo nightly | Tag repo with `nightly-master` tag if build python source distribution and python wheels finished successfully. | - | - | Yes | - |
 
-### Google Cloud Platform Credentials
-
-Some of the jobs require variables stored as a [GitHub Secrets](https://docs.github.com/en/actions/configuring-and-managing-workflows/creating-and-storing-encrypted-secrets)
-to perform operations on Google Cloud Platform. Currently these jobs are limited to Apache repository only.
-These variables are:
- * `GCP_SA_EMAIL` - Service account email address. This is usually of the format `<name>@<project-id>.iam.gserviceaccount.com`.
- * `GCP_SA_KEY` - Service account key. This key should be created and encoded as a Base64 string (eg. `cat my-key.json | base64` on macOS).
+#### Python tests - [python_tests.yml](.github/workflows/python_tests.yml)
 
-Service Account shall have following permissions:
- * Storage Object Admin (roles/storage.objectAdmin)
+| Job | Description | Pull Request Run | Direct Push/Merge Run | Scheduled Run | Requires GCP Credentials |
+|----------------------------------|-----------------------------------------------------------------------------------------------------------------------|------------------|-----------------------|---------------|--------------------------|
+| Check are GCP variables set | Checks are GCP variables are set. Jobs which required them depends on the output of this job. | Yes | Yes | Yes | Yes/No |

Review comment:
   nit: depend

##########
File path: CI.md
##########
@@ -75,8 +75,28 @@ run categories. Here is a summary of the run categories with regards of the jobs
 Those jobs often have matrix run strategy which runs several different variations of the jobs
 (with different platform type / Python version to run for example)
+### Google Cloud Platform Credentials
+
+Some of the jobs require variables stored as a [GitHub Secrets](https://docs.github.com/en/actions/configuring-and-managing-workflows/creating-and-storing-encrypted-secrets)

Review comment:
   nit: as [GitHub Secrets]

##########
File path: sdks/python/apache_beam/testing/util.py
##########
@@ -334,3 +335,19 @@ def open_shards(glob_pattern, mode='rt', encoding='utf-8'):
         out_file.write(in_file.read())
     concatenated_file_name = out_file.name
   return io.open(concatenated_file_name, mode, encoding=encoding)
+
+
+class TemporaryDirectory(object):

Review comment:
   Thanks. I forgot we are still on Py2, but very soon we can start removing Py2 support.
   We actually defined this at least twice already:
   https://github.com/apache/beam/blob/0098eb6f9900849b05bfc2ec2497ed8c7d37148a/sdks/python/apache_beam/utils/subprocess_server_test.py#L39
   https://github.com/apache/beam/blob/0098eb6f9900849b05bfc2ec2497ed8c7d37148a/sdks/python/apache_beam/testing/test_utils.py#L43
   Feel free to reuse one of them and add a comment
   # TODO(BEAM-8371): Use tempfile.TemporaryDirectory.
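As background for the suggestion above: the helper under discussion is a Python 2-compatible stand-in for `tempfile.TemporaryDirectory`. The two existing definitions linked in the comment are the ones to reuse; the sketch below only illustrates the general pattern, with the suggested TODO comment attached, and may differ from those helpers.

```python
# TODO(BEAM-8371): Use tempfile.TemporaryDirectory.
# Illustrative sketch only -- tempfile.TemporaryDirectory is unavailable on
# Python 2, so a context manager around mkdtemp/rmtree fills the gap.
import contextlib
import shutil
import tempfile


@contextlib.contextmanager
def TemporaryDirectory():
  """Creates a temporary directory and deletes it when the block exits."""
  temp_dir = tempfile.mkdtemp()
  try:
    yield temp_dir
  finally:
    shutil.rmtree(temp_dir, ignore_errors=True)


# Usage in a test:
#   with TemporaryDirectory() as tmp_dir:
#     ...  # write files under tmp_dir; the directory is removed afterwards
```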
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use
the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]