tvalentyn commented on code in PR #25970:
URL: https://github.com/apache/beam/pull/25970#discussion_r1152507294
##########
.test-infra/jenkins/job_PostCommit_Python.groovy:
##########
@@ -42,6 +42,7 @@ ALL_SUPPORTED_VERSIONS.each { pythonVersion ->
rootBuildScriptDir(commonJobProperties.checkoutDir)
tasks(":python${versionSuffix}PostCommit")
commonJobProperties.setGradleSwitches(delegate)
+ switches("-PsdistReuseWheel")
Review Comment:
nit: sdist stands for source distribution, and wheel distribution is the
opposite.
consider: -PuseBinaryDistribution or -PuseWheelDistribution and
`project.ext.sdkLocation` instead of `project.ext.sdistFiles`
##########
buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy:
##########
@@ -2785,15 +2785,70 @@ class BeamModulePlugin implements Plugin<Project> {
// distribution tarball generated by :sdks:python:sdist.
project.configurations { distTarBall }
+ // Create a task to install Beam python SDK locally before running tests
on
+ // Google cloud platform. The task depends on ':sdks:python:sdist'
project
+ // for the tarball file.
+ // For compatible systems (Linux amd64), a 'sdistReuseWheel' system
property
+ // can be provided to re-use the wheel built in this task (by configuring
+ // `--sdk_location` pipeine option in test tasks).
def installGcpTest = project.tasks.register('installGcpTest') {
dependsOn setupVirtualenv
dependsOn ':sdks:python:sdist'
+
+ // if host system wheel compatible with dataflow runner
+ def compatibleWithDataflow =
("amd64".equalsIgnoreCase(System.getProperty("os.arch")) &&
+ "linux".equalsIgnoreCase(System.getProperty("os.name")))
+ if (!compatibleWithDataflow && project.hasProperty('sdistReuseWheel'))
{
+ throw new GradleException('-PsdistReuseWheel is set for the task but
the ' +
+ 'host system is not compatible with Dataflow worker container
image.')
+ }
+
+ // Set sdistFiles project ext at execution time as the path to the
+ // generated installable Python SDK package. If project property
+ // sdistUseWheel is set, targets to wheel, otherwise a tarball.
+ project.ext.sdistFiles = project.files()
+
doLast {
- def distTarBall = "${pythonRootDir}/build/apache-beam.tar.gz"
+ def packageFile = "${pythonRootDir}/build/apache-beam.tar.gz"
+ if (compatibleWithDataflow) {
+ // build wheel in separate folder to avoid racing conditions
+ project.copy {
+ from project.tarTree(project.resources.gzip(packageFile))
+ into project.buildDir
+ }
+ def srcDirs = project.files()
+ project.buildDir.eachDirMatch({it.startsWith('apache-beam')},
{srcDirs.from it})
+ def srcDir = srcDirs.singleFile
+ project.exec {
+ executable 'sh'
+ args '-c', ". ${project.ext.envdir}/bin/activate && pip install
'Cython<1' " +
+ "&& cd ${srcDir} && python setup.py -q sdist bdist_wheel
--dist-dir ${project.buildDir}"
+ }
+ def collection = project.fileTree(project.buildDir){
+ include "**/*${project.ext.pythonVersion.replace('.', '')}*.whl"
+ exclude 'srcs/**'
+ }
+ def packageFilename = collection.singleFile.getName()
+ // rename to the suffix accepted by sdks/python/container/boot.go.
+ def renamed = packageFilename.replace(
Review Comment:
Do we actually have to rename the wheel? I think what matters is that it's
compatible with target platform, not the exact name.
##########
buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy:
##########
@@ -2785,15 +2785,70 @@ class BeamModulePlugin implements Plugin<Project> {
// distribution tarball generated by :sdks:python:sdist.
project.configurations { distTarBall }
+ // Create a task to install Beam python SDK locally before running tests
on
+ // Google cloud platform. The task depends on ':sdks:python:sdist'
project
+ // for the tarball file.
+ // For compatible systems (Linux amd64), a 'sdistReuseWheel' system
property
+ // can be provided to re-use the wheel built in this task (by configuring
+ // `--sdk_location` pipeine option in test tasks).
def installGcpTest = project.tasks.register('installGcpTest') {
dependsOn setupVirtualenv
dependsOn ':sdks:python:sdist'
+
+ // if host system wheel compatible with dataflow runner
+ def compatibleWithDataflow =
("amd64".equalsIgnoreCase(System.getProperty("os.arch")) &&
Review Comment:
Another idea is to try to get the wheel that ci builds for us (in fact, we
build wheel for every PR), but we don't upload wheels to
gs://beam-wheels-staging for in-flight PRs, only for submitted commits. we
could potentially change that and upload or all commits, into a bucket with
lifecycle configured. The wheels that we do upload are tagged by commit ID.
Sounds like more going that route is more complexity though that just building
a wheel when we can. An advantage is: we will use the same wheel we release.
Maybe this will be simpler when we move everything to github actions.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]