Hi Python SDK Developers,

You may have noticed that the Gradle files have changed a lot recently, as
parallelization <https://guides.gradle.org/performance/#parallel_execution>
was applied to the Python tests and more Python versions were enabled in
testing. The build scripts contain a number of tricks, and the tests have
grown organically and are scattered under sdks/python, which has caused
friction (such as the rollback PR-8059
<https://github.com/apache/beam/pull/8059>).

Thus, I created BEAM-6907 <https://issues.apache.org/jira/browse/BEAM-6907> and
would like to start some work to clean up and standardize the Gradle
structure in the Python SDK. In general, I think we want to:

- Apply parallel execution
- Share common tasks
- Centralize test-related tasks
- Have a clear Gradle structure for projects/tasks

This is the Gradle directory structure I propose:

sdks/python/
    build.gradle    --> hold builds, snapshot, analytic tasks
    test-suites/    --> all pre/post/VR test suites under here
        README.md
        dataflow/    --> grouped by runner or unit test (tox)
            py27/    --> grouped by py version
                build.gradle
            py35/
            ...
        direct/
            py27/
            ...
        flink/
        tox/
        ...


The ideas are:
- Keep only build, snapshot and analytic jobs in sdks/python/build.gradle
- Move all test-related tasks to sdks/python/test-suites/
- In sdks/python/test-suites, first group by runner, unit test, or other
testing that doesn't fit either, and then by Python version if needed.
- An example of a ../test-suites/../py35/build.gradle file is here:
<https://github.com/apache/beam/blob/master/sdks/python/test-suites/dataflow/py3/build.gradle>
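
To make the grouping rule above concrete, here is a minimal Python sketch of
how a suite would map to its build.gradle under the proposed layout. The
helper name suite_build_file is hypothetical and exists only for
illustration. Whether groups without a version level (for example flink/ or
tox/) hold a build.gradle directly is my assumption, not something the
proposal settles.

```python
from pathlib import PurePosixPath
from typing import Optional

# Root of the proposed test-suites tree, relative to the repository root.
TEST_SUITES_ROOT = PurePosixPath("sdks/python/test-suites")


def suite_build_file(group: str, py_version: Optional[str] = None) -> PurePosixPath:
    """Return the build.gradle path for one test suite (hypothetical helper).

    group: first-level grouping, i.e. a runner ("dataflow", "direct", "flink")
        or a unit-test group ("tox").
    py_version: optional second-level grouping such as "py27" or "py35".
        Assumption: when omitted, build.gradle sits directly in the group dir.
    """
    suite_dir = TEST_SUITES_ROOT / group
    if py_version is not None:
        suite_dir = suite_dir / py_version
    return suite_dir / "build.gradle"


if __name__ == "__main__":
    # Print the build files for a few suites named in the proposed tree.
    for group, version in [("dataflow", "py27"), ("dataflow", "py35"), ("direct", "py27")]:
        print(suite_build_file(group, version))
```

The point is only that a suite's location follows from two keys, the runner
(or unit-test group) first and the Python version second, so a new suite has
one obvious place to go.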

Please feel free to explore the existing Gradle scripts in the Python SDK and
share any thoughts you have on this proposal.

Thanks!
Mark
